Abstract

We consider a change detection problem in which the arrival rate of a Poisson process
changes suddenly at some unknown and unobservable disorder time. The prior distribution of the
disorder time is assumed to be known. The objective is to detect the disorder time with an
online detection rule (a stopping time) in a way that balances the frequency of false alarms and
the detection delay. In previous studies of this problem, the prior distribution of the disorder
time has been taken to be exponential for analytical tractability. Here, we take the prior
distribution to be a phase-type distribution, which is the distribution of the absorption time
of a continuous-time Markov chain with a finite state space. We find an optimal stopping rule
for this general case and give a numerical algorithm that calculates the parameters of
$\varepsilon$-optimal strategies for any $\varepsilon > 0$. We illustrate our findings on two examples.

1 Introduction 

Suppose that our observations come from a Poisson process $X = \{X_t : t \ge 0\}$ whose arrival rate
changes from $\lambda_0$ to $\lambda_1$ at some random time $\Theta$. The disorder time $\Theta$ is unobservable, but its prior
distribution is known. We assume that the prior distribution of $\Theta$ is a phase-type distribution.
This is the distribution of the time of death (absorption) of a non-conservative Markov process
$M = \{M_t : t \ge 0\}$, whose state space is finite and includes a single absorbing state. Our problem
is to find an alarm time $\tau$ which depends only on the past and the present observations and rings
as soon as $\Theta$ occurs. Since $\Theta$ is unobservable, a detection rule $\tau$ will make false alarms or have
detection delays. We will find a rule that optimally balances these two. We will choose a Bayesian
risk that penalizes the sum of the frequency of false alarms and a multiple of the detection delay, as in
Peskir and Shiryaev (2002).

So far in the literature on continuous-time Bayesian quickest detection problems, the distribution
of the disorder time has always been taken to be exponential for analytical tractability; see,
e.g., Galchuk and Rozovsky (1971), Davis (1976), Shiryaev (1978), Peskir and Shiryaev (2000, 2002),
Beibel (2000), Karatzas (2003), Bayraktar et al. (2005, 2006), Dayanik and Sezer (2005), Bayraktar
and Dayanik (2006). The disorder time in the works cited above is modeled as the first arrival time
of a Poisson process that we do not observe. We change this assumption on the nature of the
arrivals for broader applicability, and we solve the Poisson disorder problem with a phase-type
disorder distribution. This seems to strike a balance between generality and tractability. Indeed,
any distribution on the positive half-line may be approximated arbitrarily closely by phase-type
distributions. See Neuts (1989) for this and other properties of this class of distributions.

Let $\{1, \ldots, n, \Delta\}$ denote the state space of $M$, where $\Delta$ is absorbing and the rest of the states
are transient. To solve the Poisson disorder problem, we first show that it is equivalent to an
optimal stopping problem for an $(n+1)$-dimensional piecewise deterministic Markov process
$\vec\Pi_t = [\Pi_t^{(1)}, \ldots, \Pi_t^{(n)}, \Pi_t]$, $t \ge 0$, whose $i$th coordinate is the posterior probability
$\Pi_t^{(i)} = \mathbb{P}^{\vec\pi}\{M_t = i \mid \mathcal{F}_t\}$ that the Markov chain $M$ is in state $i$ given the past observations
$\mathcal{F}_t = \sigma\{X_s : 0 \le s \le t\}$ of $X$. The process $\Pi_t$, $t \ge 0$, is the posterior probability that the
disorder has already occurred. All of the coordinates are driven by the same point process. We show
that the optimal stopping time (of the filtration $\mathbb{F} = \{\mathcal{F}_t\}_{t \ge 0}$) is the hitting time of the process
$\vec\Pi$ to some closed convex set $\Gamma$ with non-empty interior. We describe a numerical algorithm that
approximates the optimal Bayes risk within any given positive error margin. Among the outputs of
this algorithm are boundary curves that characterize $\varepsilon$-optimal stopping times. Once these curves
are determined, the only thing an observer has to do is to ring the alarm as soon as $\vec\Pi$, which is
completely determined by the observations of $X$, crosses one of these boundaries, continuously or
via a jump. To see the efficacy of the numerical algorithm, we use it to approximate the minimum
Bayes risk when the prior distribution of the disorder time is an Erlang or a hyperexponential
distribution with two non-absorbing states.

The rest of the paper is organized as follows. In Section 2, we give a precise description of the
problem and show that it is equivalent to solving an optimal stopping problem for the process $\vec\Pi$.
In Section 3, we show that the minimum Bayes risk can be uniformly approximated by a sequence
of functions that can be constructed via an iterative application of an integral operator to the
terminal penalty of the optimal stopping problem described in Section 2. A similar sequential
approximation technique was employed by Bayraktar et al. (2006) in solving a Poisson disorder
problem in which the disorder distribution was exponential and the post-disorder arrival rate was a
random variable. The authors formulated the problem under an auxiliary probability measure as an
optimal stopping problem for an $\mathbb{R}_+$-valued odds-ratio process. If we used a formulation similar to
theirs, we would obtain an optimal stopping problem with an unbounded continuation region, which
is not suitable for numerical implementation. Also, the optimal stopping problem we consider
involves a terminal penalty term and a running cost with no discount factor. In Section 3, we also
show that an optimal stopping time exists, and we describe two different types of
$\varepsilon$-optimal stopping times. In Section 4, we describe a numerical algorithm that can approximate
the optimal Bayes risk to a given level of accuracy. Finally, Section 5 provides several examples
illustrating our solution. The longer proofs are deferred to the Appendix.



2 Problem Statement 



Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space hosting two independent Poisson processes $(X_t^{(0)})_{t \ge 0}$
and $(X_t^{(1)})_{t \ge 0}$ with intensities $\lambda_0$ and $\lambda_1$, respectively, and an independent continuous-time Markov
chain $M = (M_t)_{t \ge 0}$ with state space

\[
E = \{1, 2, \ldots, n, \Delta\}. \tag{2.1}
\]



Here, $\Delta$ is an absorbing state and all the other states are transient. The infinitesimal generator of
$M$, which we denote by $\mathcal{A} = (q_{ij})_{i,j \in E}$, is of the form

\[
\mathcal{A} = \begin{pmatrix} R & r \\ 0 & 0 \end{pmatrix}, \tag{2.2}
\]



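where the $n \times 1$ vector $r$ is non-negative, and the $n \times n$ matrix $R$ is nonsingular. The matrix $R$ has
negative diagonal and non-negative off-diagonal entries. Moreover, $R$ and $r$ satisfy $R \cdot \mathbf{1} + r = 0$,
where $\mathbf{1}$ denotes the $n \times 1$ vector of ones.
For a point $\vec\pi = [\pi_1, \pi_2, \ldots, \pi_n, \pi]$ in

\[
D = \Big\{ \vec\pi \in [0,1]^{n+1} : \sum_{i=1}^{n} \pi_i + \pi = 1 \Big\}, \tag{2.3}
\]

let $\mathbb{P}^{\vec\pi}$ denote the probability measure $\mathbb{P}$ such that the process $M$ has initial distribution $\vec\pi$. That
is,

\[
\mathbb{P}^{\vec\pi}\{A\} = \pi_1 \mathbb{P}\{A \mid M_0 = 1\} + \cdots + \pi_n \mathbb{P}\{A \mid M_0 = n\} + \pi\, \mathbb{P}\{A \mid M_0 = \Delta\}, \tag{2.4}
\]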



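for all $A \in \mathcal{F}$. The absorption time of $M$ is defined as $\Theta = \inf\{t \ge 0 : M_t = \Delta\}$, and its distribution
is denoted by

\[
F_{\vec\pi}(t) = \mathbb{P}^{\vec\pi}\{\Theta \le t\} = 1 - [\pi_1, \pi_2, \ldots, \pi_n] \cdot \exp(tR) \cdot \mathbf{1}, \qquad 0 \le t < \infty. \tag{2.5}
\]

Here, $\Theta$ is said to have a phase-type distribution; see, e.g., Neuts (1989).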
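
As a numerical illustration of (2.5), the following sketch evaluates the phase-type distribution function in plain Python. The function names (`mat_mul`, `mat_exp`, `phase_type_cdf`) and the scaling-and-squaring Taylor scheme for the matrix exponential are assumptions made for this example only; they are not part of the paper, and a library routine for the matrix exponential could be substituted.

```python
import math

def mat_mul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_exp(a, squarings=10, terms=20):
    """exp(a) via a truncated Taylor series of a / 2**squarings, then repeated squaring."""
    n = len(a)
    scaled = [[x / 2.0 ** squarings for x in row] for row in a]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def phase_type_cdf(t, alpha, R):
    """F(t) = 1 - alpha * exp(tR) * 1, with alpha = [pi_1, ..., pi_n] as in (2.5)."""
    e = mat_exp([[t * x for x in row] for row in R])
    return 1.0 - sum(alpha[i] * sum(e[i]) for i in range(len(alpha)))
```

For an Erlang prior with two phases of rate $\mu$, that is $R = \begin{pmatrix} -\mu & \mu \\ 0 & -\mu \end{pmatrix}$ and $[\pi_1, \pi_2] = [1, 0]$, the output can be compared with the closed form $1 - e^{-\mu t}(1 + \mu t)$.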
The processes $X^{(0)}$, $X^{(1)}$ and $M$ are unobservable. Rather, we observe

\[
X_t = \int_0^t 1_{\{s < \Theta\}}\, dX_s^{(0)} + \int_0^t 1_{\{s \ge \Theta\}}\, dX_s^{(1)}, \tag{2.6}
\]
whose natural filtration will be denoted by $\mathbb{F} = \{\mathcal{F}_t\}_{t \ge 0}$. Let us define $\mathbb{G} = \{\mathcal{G}_t\}_{t \ge 0}$ as an initial
enlargement of $\mathbb{F}$ by setting $\mathcal{G}_t = \mathcal{F}_t \vee \sigma\{M_s : s \ge 0\}$. That is, $\mathcal{G}_t$ is the information available
to a genie at time $t$ who is given the paths of the process $M_t$, $t \ge 0$. If the paths of $M_t$,
$t \ge 0$, are available at time 0, then the observations come from a process $X$ that is a Poisson
process with rate $\lambda_0$ on the time interval $[0, \Theta)$ and with rate $\lambda_1$ on $[\Theta, \infty)$ for known positive
constants $\lambda_0$ and $\lambda_1$. Specifically, the observation process $X$ is a counting process such that
$X_t - \int_0^t [\lambda_0 1_{\{s < \Theta\}} + \lambda_1 1_{\{s \ge \Theta\}}]\, ds$, $t \ge 0$, is a $(\mathbb{P}^{\vec\pi}, \mathbb{G})$-martingale. The crucial feature here is that
$\Theta$ is neither known nor observable; only the process $X$ is observable. The problem is then to find
a quickest detection rule for the disorder time $\Theta$ that is adapted to the history $\mathbb{F}$ generated by
the observed process $X$ only. A detection rule is a stopping time $\tau$ of the filtration $\mathbb{F}$, and we will
denote the set of these stopping times by $\mathcal{S}$. Our objective is to find an element of $\mathcal{S}$ minimizing
the Bayes risk

\[
R_\tau(\vec\pi) = \mathbb{P}^{\vec\pi}\{\tau < \Theta\} + c\, \mathbb{E}^{\vec\pi}\{(\tau - \Theta)^+\}, \tag{2.7}
\]

for some positive constant $c$. Here $a^+ = \max(a, 0)$ for any $a \in \mathbb{R}$. The first term in (2.7) penalizes
the frequency of false alarms and the second term penalizes the detection delay.

Remark 2.1. In order to minimize $R_\tau(\vec\pi)$ in $\mathcal{S}$, it is enough to consider stopping times with
bounded expectation. Indeed, if $\mathbb{E}^{\vec\pi}\{\tau\} > 1/c + \mathbb{E}^{\vec\pi}\{\Theta\}$, then $R_\tau(\vec\pi) \ge c(\mathbb{E}^{\vec\pi}\{\tau\} - \mathbb{E}^{\vec\pi}\{\Theta\}) > 1$, which
is greater than the cost incurred upon stopping immediately. In the remainder we will use $\mathcal{S}_f$ to
denote the class of $\mathbb{F}$-stopping times whose expectations are less than or equal to $1/c + \mathbb{E}^{\vec\pi}\{\Theta\}$.

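Our objective is then to compute

\[
V(\vec\pi) = \inf_{\tau \in \mathcal{S}_f} R_\tau(\vec\pi) = R_{\tau^*}(\vec\pi), \qquad \text{for all } \vec\pi \in D, \tag{2.8}
\]

and to identify a rule $\tau^*$ (if there exists one) for which this infimum is attained. Note also that we
have $0 \le V(\vec\pi) \le 1$ for all $\vec\pi \in D$.

Remark 2.2. Let us introduce the posterior probability distribution $\vec\Pi_t = [\Pi_t^{(1)}, \ldots, \Pi_t^{(n)}, \Pi_t]$, $t \ge 0$,
where

\[
\Pi_t = \mathbb{P}^{\vec\pi}\{\Theta \le t \mid \mathcal{F}_t\} = \mathbb{P}^{\vec\pi}\{M_t = \Delta \mid \mathcal{F}_t\}, \quad \text{and} \quad \Pi_t^{(i)} = \mathbb{P}^{\vec\pi}\{M_t = i \mid \mathcal{F}_t\}, \quad t \ge 0, \tag{2.9}
\]

for $i \in \{1, \ldots, n\}$. Using the identities $\mathbb{P}^{\vec\pi}\{\tau < \Theta\} = \mathbb{E}^{\vec\pi}\{1 - \Pi_\tau\}$ and

\[
\mathbb{E}^{\vec\pi}\{(\tau - \Theta)^+\} = \mathbb{E}^{\vec\pi}\Big\{ \int_0^\tau 1_{\{\Theta \le t\}}\, dt \Big\} = \mathbb{E}^{\vec\pi}\Big\{ \int_0^\infty 1_{\{\Theta \le t\}} 1_{\{\tau > t\}}\, dt \Big\} = \mathbb{E}^{\vec\pi}\Big\{ \int_0^\tau \Pi_t\, dt \Big\}, \tag{2.10}
\]

we can represent the value function in (2.8) in terms of the posterior probability distribution as

\[
V(\vec\pi) = \inf_{\tau \in \mathcal{S}_f} \mathbb{E}^{\vec\pi}\Big\{ \int_0^\tau k(\vec\Pi_t)\, dt + h(\vec\Pi_\tau) \Big\}, \tag{2.11}
\]

in which

\[
k(\vec\pi) = c\pi, \quad \text{and} \quad h(\vec\pi) = 1 - \pi, \tag{2.12}
\]

for all $\vec\pi = [\pi_1, \pi_2, \ldots, \pi_n, \pi] \in D$.

Remark 2.3. It follows from (2.11) that

\[
V(\vec\pi) \le h(\vec\pi) = 1 - \pi, \tag{2.13}
\]

for all $\vec\pi = [\pi_1, \pi_2, \ldots, \pi_n, \pi] \in D$.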

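Lemma 2.1. Let us define the hazard rate of the distribution of $\Theta$ as

\[
\eta(t) = \frac{F_{\vec\pi}'(t)}{1 - F_{\vec\pi}(t)}, \qquad \text{for } t \ge 0. \tag{2.14}
\]

Then the a-posteriori probability process $(\Pi_t)_{t \ge 0}$ satisfies

\[
d\Pi_t = [\eta(t) - (\lambda_1 - \lambda_0)\Pi_t](1 - \Pi_t)\, dt + \frac{(\lambda_1 - \lambda_0)\Pi_{t-}(1 - \Pi_{t-})}{\lambda_0(1 - \Pi_{t-}) + \lambda_1 \Pi_{t-}}\, dX_t, \qquad \Pi_0 = \pi. \tag{2.15}
\]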

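Proof. We will first introduce a reference probability measure $\mathbb{P}_0^{\vec\pi}$ under which the processes $M$ and
$X$ are independent. Moreover, the probability law of $M$ under $\mathbb{P}_0^{\vec\pi}$ will remain unchanged.
Let us introduce

\[
Z_t = \exp\Big\{ \int_0^t \log\Big(\frac{H(s-)}{\lambda_0}\Big)\, dX_s - \int_0^t [H(s) - \lambda_0]\, ds \Big\}, \qquad t \ge 0, \tag{2.16}
\]

in which $H(s) = \lambda_0 1_{\{s < \Theta\}} + \lambda_1 1_{\{s \ge \Theta\}}$. Using the process $Z$, we can define a new probability measure
$\mathbb{P}_0^{\vec\pi}$ on $(\Omega, \mathbb{G})$ locally in terms of the Radon-Nikodym derivatives

\[
\frac{d\mathbb{P}^{\vec\pi}}{d\mathbb{P}_0^{\vec\pi}}\bigg|_{\mathcal{G}_t} = Z_t = 1_{\{\Theta > t\}} + 1_{\{\Theta \le t\}} \frac{L_t}{L_\Theta} \tag{2.17}
\]

for every $0 \le t < \infty$, where

\[
L_t = \Big(\frac{\lambda_1}{\lambda_0}\Big)^{X_t} e^{-(\lambda_1 - \lambda_0)t}. \tag{2.18}
\]

Under the measure $\mathbb{P}_0^{\vec\pi}$, the process $Z$ is a martingale, and $X$ is a Poisson process with intensity $\lambda_0$ that
is independent of $M$ (see, e.g., Section 2 in Bayraktar et al. (2006), or Appendix A1 in Dayanik and
Sezer (2005)). Moreover, $\mathbb{P}^{\vec\pi}$ and $\mathbb{P}_0^{\vec\pi}$ coincide on $\mathcal{G}_0 = \sigma\{M_s ; s \ge 0\}$; therefore $\mathbb{P}_0^{\vec\pi}\{\Theta \le t\} = F_{\vec\pi}(t)$.

Using the Bayes rule (see, e.g., Liptser and Shiryaev (2001)), also known as the Kallianpur-Striebel
formula, we obtain

\[
\Pi_t = \mathbb{P}^{\vec\pi}\{\Theta \le t \mid \mathcal{F}_t\} = \frac{\mathbb{E}_0^{\vec\pi}\{Z_t 1_{\{\Theta \le t\}} \mid \mathcal{F}_t\}}{\mathbb{E}_0^{\vec\pi}\{Z_t \mid \mathcal{F}_t\}}, \qquad
1 - \Pi_t = \frac{\mathbb{E}_0^{\vec\pi}\{Z_t 1_{\{\Theta > t\}} \mid \mathcal{F}_t\}}{\mathbb{E}_0^{\vec\pi}\{Z_t \mid \mathcal{F}_t\}} = \frac{1 - F_{\vec\pi}(t)}{\mathbb{E}_0^{\vec\pi}\{Z_t \mid \mathcal{F}_t\}}. \tag{2.19}
\]

Here, to derive the second equality in the second equation, we used the independence of $\Theta$ and $X$ under
$\mathbb{P}_0^{\vec\pi}$, together with the fact that $Z_t = 1$ on the event $\{\Theta > t\}$.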

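Let us define the odds ratio process

\[
\Phi_t = \frac{\Pi_t}{1 - \Pi_t}, \qquad 0 \le t < \infty. \tag{2.20}
\]

Using (2.19) we can obtain a new representation for the odds ratio process:

\[
\Phi_t = \frac{\mathbb{E}_0^{\vec\pi}\{Z_t 1_{\{\Theta \le t\}} \mid \mathcal{F}_t\}}{1 - F_{\vec\pi}(t)} = \frac{1}{1 - F_{\vec\pi}(t)} \Big[ \pi L_t + \int_0^t \frac{L_t}{L_s} F_{\vec\pi}'(s)\, ds \Big]. \tag{2.21}
\]

Here, again we used the independence of $\Theta$ and $\mathbb{F}$ under $\mathbb{P}_0^{\vec\pi}$. The process $L = \{L_t, t \ge 0\}$ in (2.17) is a
$(\mathbb{P}_0^{\vec\pi}, \mathbb{F})$-martingale and is the unique locally bounded solution of the equation

\[
dL_t = [(\lambda_1/\lambda_0) - 1] L_{t-} (dX_t - \lambda_0\, dt), \qquad L_0 = 1;
\]

see, e.g., Revuz and Yor (1999) or Jacod and Shiryaev (2003). Applying the chain rule to (2.21), we
get

\[
d\Phi_t = \eta(t)(1 + \Phi_t)\, dt + \Phi_{t-} \Big[\frac{\lambda_1}{\lambda_0} - 1\Big]\, d(X_t - \lambda_0 t), \qquad \Phi_0 = \frac{\pi}{1 - \pi}. \tag{2.22}
\]

By another application of the chain rule to (2.20), together with (2.22), we obtain (2.15). □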

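Proposition 2.1. The dynamics of the posterior probability distribution $\vec\Pi_t = [\Pi_t^{(1)}, \ldots, \Pi_t^{(n)}, \Pi_t]$,
$t \ge 0$, which is defined in (2.9), are given by

\[
d\Pi_t = \Big[ \sum_{j=1}^n q_{j\Delta} \Pi_t^{(j)} - (\lambda_1 - \lambda_0)\Pi_t(1 - \Pi_t) \Big]\, dt + \frac{(\lambda_1 - \lambda_0)\Pi_{t-}(1 - \Pi_{t-})}{\lambda_0(1 - \Pi_{t-}) + \lambda_1 \Pi_{t-}}\, dX_t, \tag{2.23}
\]

\[
d\Pi_t^{(i)} = \Big[ \sum_{j=1}^n q_{ji} \Pi_t^{(j)} + (\lambda_1 - \lambda_0)\Pi_t \Pi_t^{(i)} \Big]\, dt - \frac{(\lambda_1 - \lambda_0)\Pi_{t-} \Pi_{t-}^{(i)}}{\lambda_0(1 - \Pi_{t-}) + \lambda_1 \Pi_{t-}}\, dX_t, \tag{2.24}
\]

for $i \in \{1, \ldots, n\}$, and with $\vec\Pi_0 = [\pi_1, \ldots, \pi_n, \pi]$.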

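Proof. First, observe that the hazard rate function of the distribution $F_{\vec\pi}$ can be written as

\[
\eta(t) = \frac{\sum_{j=1}^n \mathbb{P}^{\vec\pi}\{M_t = j\}\, q_{j\Delta}}{1 - F_{\vec\pi}(t)}. \tag{2.25}
\]

On the other hand,

\[
\begin{aligned}
\Pi_t^{(i)} = \mathbb{P}^{\vec\pi}\{M_t = i \mid \mathcal{F}_t\}
&= \frac{\mathbb{E}_0^{\vec\pi}\{Z_t 1_{\{M_t = i\}} \mid \mathcal{F}_t\}}{\mathbb{E}_0^{\vec\pi}\{Z_t \mid \mathcal{F}_t\}} \\
&= \frac{\mathbb{E}_0^{\vec\pi}\{1_{\{M_t = i\}} \mid \mathcal{F}_t\}}{\mathbb{E}_0^{\vec\pi}\{Z_t \mid \mathcal{F}_t\}}
= \frac{\mathbb{E}_0^{\vec\pi}\{1_{\{M_t = i\}}\}}{\mathbb{E}_0^{\vec\pi}\{Z_t \mid \mathcal{F}_t\}}
= \frac{\mathbb{E}^{\vec\pi}\{1_{\{M_t = i\}}\}}{\mathbb{E}_0^{\vec\pi}\{Z_t \mid \mathcal{F}_t\}},
\end{aligned}
\tag{2.26}
\]

in which $\mathbb{E}_0^{\vec\pi}$ denotes the expectation under the measure $\mathbb{P}_0^{\vec\pi}$, which we introduced in (2.17). The
second equality in this equation follows from Bayes' formula, the third equality follows from the
definition of $Z$ in (2.17) (note that $Z_t = 1$ on $\{M_t = i\}$ for $i \le n$), the fourth equality follows from
the independence of $M$ and $X$ under the measure $\mathbb{P}_0^{\vec\pi}$, and, finally, the fifth equality follows from
the fact that the law of $M$ is the same under the measures $\mathbb{P}^{\vec\pi}$ and $\mathbb{P}_0^{\vec\pi}$.

From (2.19) and (2.26) it is immediate that

\[
\Pi_t^{(i)} = (1 - \Pi_t)\, \frac{\mathbb{P}^{\vec\pi}\{M_t = i\}}{1 - F_{\vec\pi}(t)}. \tag{2.27}
\]

Then, from (2.25) and (2.27) it follows that

\[
\eta(t) = \frac{\sum_{j=1}^n q_{j\Delta} \Pi_t^{(j)}}{1 - \Pi_t}. \tag{2.28}
\]

This equation together with (2.15) yields (2.23).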

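We will now derive the dynamics of $(\Pi_t^{(i)})_{t \ge 0}$, $i \in \{1, \ldots, n\}$. Let $p_{ij}(t) = \mathbb{P}^{\vec\pi}\{M_t = j \mid M_0 = i\}$
denote the transition probabilities of the process $M$. Recall that $t \mapsto p_{ij}(t)$, $t \ge 0$, satisfies the
forward Kolmogorov equation, i.e.,

\[
\frac{dp_{ij}(t)}{dt} = \sum_{k=1}^n p_{ik}(t)\, q_{kj}, \tag{2.29}
\]

and that

\[
\mathbb{P}^{\vec\pi}\{M_t = i\} = \sum_{j=1}^n \pi_j\, p_{ji}(t). \tag{2.30}
\]

Now, applying the chain rule to (2.27), we obtain

\[
\begin{aligned}
d\Pi_t^{(i)}
&= -\frac{\Pi_{t-}^{(i)}}{1 - \Pi_{t-}}\, d\Pi_t + (1 - \Pi_t) \bigg[ \frac{\sum_{j=1}^n \pi_j \sum_{k=1}^n p_{jk}(t)\, q_{ki}}{1 - F_{\vec\pi}(t)} + \frac{\sum_{j=1}^n \pi_j\, p_{ji}(t)\, F_{\vec\pi}'(t)}{(1 - F_{\vec\pi}(t))^2} \bigg]\, dt \\
&= -\frac{\Pi_{t-}^{(i)}}{1 - \Pi_{t-}}\, d\Pi_t + (1 - \Pi_t) \bigg[ \sum_{k=1}^n \frac{\mathbb{P}^{\vec\pi}\{M_t = k\}}{1 - F_{\vec\pi}(t)}\, q_{ki} + \eta(t)\, \frac{\mathbb{P}^{\vec\pi}\{M_t = i\}}{1 - F_{\vec\pi}(t)} \bigg]\, dt \\
&= -\frac{\Pi_{t-}^{(i)}}{1 - \Pi_{t-}}\, d\Pi_t + \Big[ \sum_{k=1}^n q_{ki} \Pi_t^{(k)} + \eta(t)\, \Pi_t^{(i)} \Big]\, dt.
\end{aligned}
\tag{2.31}
\]

The first line follows from (2.29), and the second follows from (2.30). The last line is a result of
the identity in (2.27). This equation, together with (2.23) and (2.28), gives (2.24). □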

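Remark 2.4. Let $\vec{x}(t, \vec\pi) = (x_1(t, \vec\pi), \ldots, x_n(t, \vec\pi), x_\Delta(t, \vec\pi))$ be the solution of the system of ordinary
differential equations

\[
\begin{aligned}
dx_\Delta(t, \vec\pi) &= \Big( \sum_{j=1}^n q_{j\Delta}\, x_j(t, \vec\pi) - (\lambda_1 - \lambda_0)\, x_\Delta(t, \vec\pi)(1 - x_\Delta(t, \vec\pi)) \Big)\, dt, && \text{with } x_\Delta(0, \vec\pi) = \pi, \\
dx_i(t, \vec\pi) &= \Big( \sum_{j=1}^n q_{ji}\, x_j(t, \vec\pi) + (\lambda_1 - \lambda_0)\, x_\Delta(t, \vec\pi)\, x_i(t, \vec\pi) \Big)\, dt, && \text{with } x_i(0, \vec\pi) = \pi_i,
\end{aligned}
\tag{2.32}
\]

for $i \in \{1, \ldots, n\}$. Due to Kolmogorov's forward equations, the solution of this system of equations
can be written as

\[
\begin{aligned}
x_\Delta(t, \vec\pi) &= \frac{\pi e^{-(\lambda_1 - \lambda_0)t} + \int_0^t e^{-(\lambda_1 - \lambda_0)(t - s)} F_{\vec\pi}'(s)\, ds}{1 - F_{\vec\pi}(t) + \pi e^{-(\lambda_1 - \lambda_0)t} + \int_0^t e^{-(\lambda_1 - \lambda_0)(t - s)} F_{\vec\pi}'(s)\, ds}, \\
x_i(t, \vec\pi) &= \frac{\sum_{j=1}^n \pi_j\, p_{ji}(t)}{1 - F_{\vec\pi}(t) + \pi e^{-(\lambda_1 - \lambda_0)t} + \int_0^t e^{-(\lambda_1 - \lambda_0)(t - s)} F_{\vec\pi}'(s)\, ds}, \qquad \text{for } i \in \{1, \ldots, n\},
\end{aligned}
\tag{2.33}
\]

in terms of the transition probabilities $p_{ij}(t) = \mathbb{P}^{\vec\pi}\{M_t = j \mid M_0 = i\}$, for $i, j \in E$. Moreover, the
expressions in (2.33) are equivalent to

\[
x_\Delta(t, \vec\pi) = \frac{\mathbb{E}^{\vec\pi}\big\{ 1_{\{\Theta \le t\}}\, e^{-(\lambda_1 - \lambda_0)(t - \Theta)} \big\}}{\mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t - \Theta)^+} \big\}}
\quad \text{and} \quad
x_i(t, \vec\pi) = \frac{\mathbb{P}^{\vec\pi}\{M_t = i\}}{\mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t - \Theta)^+} \big\}},
\tag{2.34}
\]

for $i \in \{1, \ldots, n\}$.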
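
The deterministic flow in (2.32) can also be integrated numerically. The sketch below is only an illustration under stated assumptions: the names `flow_rhs` and `flow`, the layout of the rate table `Q` (row $j$ holds $q_{j1}, \ldots, q_{jn}, q_{j\Delta}$), and the choice of a classical fourth-order Runge-Kutta scheme are all introduced here and do not come from the paper.

```python
import math

def flow_rhs(x, Q, lam0, lam1):
    """Right-hand side of (2.32); x = [x_1, ..., x_n, x_Delta]."""
    n = len(x) - 1
    a = lam1 - lam0
    xd = x[n]
    # Transient coordinates: sum_j q_ji x_j + (lam1 - lam0) x_Delta x_i
    dx = [sum(Q[j][i] * x[j] for j in range(n)) + a * xd * x[i] for i in range(n)]
    # Absorbed coordinate: sum_j q_jDelta x_j - (lam1 - lam0) x_Delta (1 - x_Delta)
    dx.append(sum(Q[j][n] * x[j] for j in range(n)) - a * xd * (1.0 - xd))
    return dx

def flow(t, x0, Q, lam0, lam1, steps=2000):
    """Approximate x(t, x0) with fixed-step classical Runge-Kutta integration."""
    x, h = list(x0), t / steps
    for _ in range(steps):
        k1 = flow_rhs(x, Q, lam0, lam1)
        k2 = flow_rhs([xi + 0.5 * h * k for xi, k in zip(x, k1)], Q, lam0, lam1)
        k3 = flow_rhs([xi + 0.5 * h * k for xi, k in zip(x, k2)], Q, lam0, lam1)
        k4 = flow_rhs([xi + h * k for xi, k in zip(x, k3)], Q, lam0, lam1)
        x = [xi + h * (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
             for xi, s1, s2, s3, s4 in zip(x, k1, k2, k3, k4)]
    return x
```

In the special case $n = 1$ (an exponential prior with rate $\mu$), the numerical path can be checked against (2.34), whose denominator then has an elementary closed form as long as $\lambda_1 - \lambda_0 \ne \mu$.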

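Using the Markov property of $M$ and (2.34), we have

\[
\begin{aligned}
\mathbb{P}^{\vec\pi}\{M_{t+s} = i\}
&= \sum_{j=1}^n \mathbb{P}^{\vec\pi}\{M_t = j\} \cdot \mathbb{P}^{\vec\pi}\{M_{t+s} = i \mid M_t = j\} \\
&= \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t - \Theta)^+} \big\} \sum_{j=1}^n x_j(t, \vec\pi) \cdot \mathbb{P}^{\vec\pi}\{M_{t+s} = i \mid M_t = j\} \\
&= \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t - \Theta)^+} \big\} \cdot \mathbb{P}^{\vec{x}(t, \vec\pi)}\{M_s = i\}
\end{aligned}
\tag{2.35}
\]

for $i \le n$, and

\[
\begin{aligned}
\mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t + s - \Theta)^+} \big\}
&= \mathbb{E}^{\vec\pi}\Big\{ \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t + s - \Theta)^+} \,\big|\, M_u, u \le t \big\} \Big\} \\
&= \mathbb{E}^{\vec\pi}\Big\{ \sum_{j=1}^n 1_{\{M_t = j\}} \cdot \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(s - \Theta)^+} \,\big|\, M_0 = j \big\} + 1_{\{M_t = \Delta\}} \cdot e^{-(\lambda_1 - \lambda_0)(t + s - \Theta)} \Big\} \\
&= \sum_{j=1}^n \mathbb{P}^{\vec\pi}\{M_t = j\}\, \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(s - \Theta)^+} \,\big|\, M_0 = j \big\} + e^{-(\lambda_1 - \lambda_0)s} \cdot \mathbb{E}^{\vec\pi}\big\{ 1_{\{t \ge \Theta\}} \cdot e^{-(\lambda_1 - \lambda_0)(t - \Theta)} \big\} \\
&= \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t - \Theta)^+} \big\} \Big[ \sum_{j=1}^n x_j(t, \vec\pi) \cdot \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(s - \Theta)^+} \,\big|\, M_0 = j \big\} + x_\Delta(t, \vec\pi)\, e^{-(\lambda_1 - \lambda_0)s} \Big] \\
&= \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t - \Theta)^+} \big\} \cdot \mathbb{E}^{\vec{x}(t, \vec\pi)}\big\{ e^{-(\lambda_1 - \lambda_0)(s - \Theta)^+} \big\}.
\end{aligned}
\tag{2.36}
\]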

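Using (2.35) and (2.36), it is now easy to see that $t \mapsto \vec{x}(t, \vec\pi)$ has the semigroup property
$\vec{x}(t + s, \vec\pi) = \vec{x}(t, \vec{x}(s, \vec\pi))$. Then, the dynamics in (2.23), (2.24) and Remark 2.4 imply that $\vec\Pi$ is
a piecewise deterministic process whose natural filtration coincides with $\mathbb{F}$. Between two jumps of
$X$, the process $\vec\Pi$ follows the curves $t \mapsto \vec{x}(t, \vec\pi)$, and at arrival times of $X$, it jumps from one curve
to another. More precisely, the paths of $\vec\Pi$ have the characterization

\[
\begin{aligned}
\vec\Pi_t &= \vec{x}(t - \sigma_m, \vec\Pi_{\sigma_m}), \qquad \sigma_m \le t < \sigma_{m+1}, \quad m \in \mathbb{N}_0, \\
\vec\Pi_{\sigma_m} &= \bigg( \frac{\lambda_0 \Pi_{\sigma_m-}^{(1)}}{\lambda_0(1 - \Pi_{\sigma_m-}) + \lambda_1 \Pi_{\sigma_m-}}, \ldots, \frac{\lambda_0 \Pi_{\sigma_m-}^{(n)}}{\lambda_0(1 - \Pi_{\sigma_m-}) + \lambda_1 \Pi_{\sigma_m-}}, \frac{\lambda_1 \Pi_{\sigma_m-}}{\lambda_0(1 - \Pi_{\sigma_m-}) + \lambda_1 \Pi_{\sigma_m-}} \bigg), \qquad m \in \mathbb{N},
\end{aligned}
\tag{2.37}
\]

in which

\[
\sigma_0 = 0, \quad \text{and} \quad \sigma_m = \inf\{t > \sigma_{m-1} : X_t - X_{t-} > 0\}, \quad m \in \mathbb{N}. \tag{2.38}
\]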
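Moreover, for a bounded function $g(\cdot)$, we have

\[
\begin{aligned}
\mathbb{E}^{\vec\pi}\{g(X_{t+s} - X_t) \mid \mathcal{F}_t\}
&= \sum_{j=1}^n \mathbb{P}^{\vec\pi}\{M_t = j \mid \mathcal{F}_t\} \cdot \mathbb{E}^{\vec\pi}\{g(X_{t+s} - X_t) \mid \mathcal{F}_t, M_t = j\} \\
&\quad + \mathbb{P}^{\vec\pi}\{M_t = \Delta \mid \mathcal{F}_t\} \cdot \mathbb{E}^{\vec\pi}\{g(X_{t+s} - X_t) \mid \mathcal{F}_t, M_t = \Delta\} \\
&= \sum_{j=1}^n \Pi_t^{(j)} \cdot \mathbb{E}^{\vec\pi}\{g(X_s) \mid M_0 = j\} + \Pi_t \cdot \mathbb{E}^{\vec\pi}\{g(X_s) \mid M_0 = \Delta\} = \mathbb{E}^{\vec\Pi_t}\{g(X_s)\}.
\end{aligned}
\tag{2.39}
\]

Then, the characterization in (2.37) and (2.38) implies that $\vec\Pi$ is a $(\mathbb{P}^{\vec\pi}, \mathbb{F})$-Markov process due to
(2.39).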
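
The jump part of the characterization (2.37) is a simple Bayes update, which the following sketch implements. The function name `jump_update` and the list layout (transient coordinates first, absorbed coordinate last) are conventions assumed for this illustration.

```python
def jump_update(pi_vec, lam0, lam1):
    """Posterior just after an arrival of X, given the posterior just before it (see (2.37)).

    pi_vec = [Pi^(1), ..., Pi^(n), Pi]; the last entry is the probability that
    the disorder has already occurred.
    """
    *transient, absorbed = pi_vec
    denom = lam0 * (1.0 - absorbed) + lam1 * absorbed
    return [lam0 * p / denom for p in transient] + [lam1 * absorbed / denom]
```

For instance, with $\lambda_0 = 1$, $\lambda_1 = 3$ and pre-jump posterior $[0.5, 0.3, 0.2]$, the normalizing constant is $1.4$ and the posterior probability of a past disorder moves from $0.2$ to $0.6/1.4 = 3/7$; an arrival is evidence for the higher rate.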

3 Sequential Approximation 

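Let us define the sequence of functions

\[
V_m(\vec\pi) = \inf_{\tau \in \mathcal{S}_f} \mathbb{E}^{\vec\pi}\Big\{ \int_0^{\tau \wedge \sigma_m} k(\vec\Pi_t)\, dt + h(\vec\Pi_{\tau \wedge \sigma_m}) \Big\}, \tag{3.1}
\]

in which $\sigma_m$, $m \in \mathbb{N}_0$, is defined in (2.38). The functions $V_m(\cdot)$, $m \in \mathbb{N}_0$, are non-negative and
bounded above by $h(\cdot)$. By definition, the sequence $\{V_m\}_{m \ge 1}$ is decreasing and $V_m \ge V$ for all
$m$. Therefore the pointwise limit $\lim_{m \to \infty} V_m$ exists and is greater than or equal to $V$. In fact, a
stronger convergence result holds, as the next proposition shows.

Proposition 3.1. As $m \to \infty$, the sequence $\{V_m(\cdot)\}_{m \ge 1}$ converges to $V(\cdot)$ uniformly on $D$. In
fact, for every $m \in \mathbb{N}$,

\[
V_m(\vec\pi) - \sqrt{\Big( \frac{1}{c} + \mathbb{E}^{\vec\pi}\{\Theta\} \Big) \frac{\max\{\lambda_0, \lambda_1\}}{m - 1}} \le V(\vec\pi) \le V_m(\vec\pi), \qquad \text{for all } \vec\pi \in D. \tag{3.2}
\]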

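Proof. The second inequality in (3.2) follows immediately, since by definition $V_m(\cdot) \ge V(\cdot)$. Let us
prove the first inequality. For any $\tau \in \mathcal{S}_f$, the expectation $\mathbb{E}^{\vec\pi}\{ \int_0^\tau k(\vec\Pi_t)\, dt + h(\vec\Pi_\tau) \}$ can be written
as

\[
\begin{aligned}
&\mathbb{E}^{\vec\pi}\Big\{ \int_0^{\tau \wedge \sigma_m} k(\vec\Pi_t)\, dt + h(\vec\Pi_{\tau \wedge \sigma_m}) \Big\} + \mathbb{E}^{\vec\pi}\Big\{ 1_{\{\tau > \sigma_m\}} \Big[ \int_{\sigma_m}^{\tau} k(\vec\Pi_t)\, dt + h(\vec\Pi_\tau) - h(\vec\Pi_{\sigma_m}) \Big] \Big\} \\
&\qquad \ge \mathbb{E}^{\vec\pi}\Big\{ \int_0^{\tau \wedge \sigma_m} k(\vec\Pi_t)\, dt + h(\vec\Pi_{\tau \wedge \sigma_m}) \Big\} - \mathbb{E}^{\vec\pi}\big\{ 1_{\{\tau > \sigma_m\}} \big\},
\end{aligned}
\tag{3.3}
\]

since $0 \le h(\cdot) \le 1$. Note that

\[
\mathbb{E}^{\vec\pi}\big\{ 1_{\{\tau > \sigma_m\}} \big\} \le \mathbb{E}^{\vec\pi}\Big\{ 1_{\{\tau > \sigma_m\}} \Big( \frac{\tau}{\sigma_m} \Big)^{1/2} \Big\} \le \sqrt{\mathbb{E}^{\vec\pi}\{\tau\}\, \mathbb{E}^{\vec\pi}\Big\{ \frac{1}{\sigma_m} \Big\}}, \tag{3.4}
\]

which follows as a result of the Cauchy-Schwarz inequality, and that

\[
\mathbb{E}^{\vec\pi}\Big\{ \frac{1}{\sigma_m} \Big\} \le \frac{\max\{\lambda_0, \lambda_1\}}{m - 1}. \tag{3.5}
\]

Since $\mathbb{E}^{\vec\pi}\{\tau\} \le 1/c + \mathbb{E}^{\vec\pi}\{\Theta\}$ for any $\tau \in \mathcal{S}_f$, using (3.3), (3.4) and (3.5) we obtain

\[
\mathbb{E}^{\vec\pi}\Big\{ \int_0^\tau k(\vec\Pi_t)\, dt + h(\vec\Pi_\tau) \Big\} \ge \mathbb{E}^{\vec\pi}\Big\{ \int_0^{\tau \wedge \sigma_m} k(\vec\Pi_t)\, dt + h(\vec\Pi_{\tau \wedge \sigma_m}) \Big\} - \sqrt{\Big( \frac{1}{c} + \mathbb{E}^{\vec\pi}\{\Theta\} \Big) \frac{\max\{\lambda_0, \lambda_1\}}{m - 1}}. \tag{3.6}
\]

Now taking the infimum of both sides over the stopping rules in $\mathcal{S}_f$, we obtain the first inequality
in (3.2). □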

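To calculate the functions $V_m(\cdot)$ iteratively, we introduce the following operators acting on
bounded functions $w : D \to \mathbb{R}$:

\[
\begin{aligned}
Jw(t, \vec\pi) &= \mathbb{E}^{\vec\pi}\Big\{ \int_0^{t \wedge \sigma_1} k(\vec\Pi_s)\, ds + 1_{\{t < \sigma_1\}} h(\vec\Pi_t) + 1_{\{t \ge \sigma_1\}} w(\vec\Pi_{\sigma_1}) \Big\}, && t \in [0, \infty], \\
J_t w(\vec\pi) &= \inf_{u \in [t, \infty]} Jw(u, \vec\pi), && t \in [0, \infty].
\end{aligned}
\tag{3.7}
\]

The action of the operator $J$ on the function $w$ can be written as

\[
Jw(t, \vec\pi) = \int_0^t \mathbb{P}^{\vec\pi}\{s < \sigma_1\}\, k(\vec{x}(s, \vec\pi))\, ds + \int_0^t \mathbb{P}^{\vec\pi}\{\sigma_1 \in ds\}\, Sw(\vec{x}(s, \vec\pi)) + h(\vec{x}(t, \vec\pi))\, \mathbb{P}^{\vec\pi}\{t < \sigma_1\}, \tag{3.8}
\]

in which

\[
Sw(\vec\pi) = w\bigg( \frac{\lambda_0 \pi_1}{\lambda_0(1 - \pi) + \lambda_1 \pi}, \ldots, \frac{\lambda_0 \pi_n}{\lambda_0(1 - \pi) + \lambda_1 \pi}, \frac{\lambda_1 \pi}{\lambda_0(1 - \pi) + \lambda_1 \pi} \bigg). \tag{3.9}
\]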

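Let us now compute the distribution and the density of $\sigma_1$ under $\mathbb{P}^{\vec\pi}$, since they appear
in the expression for $Jw$. We have

\[
\begin{aligned}
\mathbb{P}^{\vec\pi}\{\sigma_1 > t\}
&= \int_0^\infty \mathbb{P}^{\vec\pi}\{\sigma_1 > t \mid \Theta = s\}\, \mathbb{P}^{\vec\pi}\{\Theta \in ds\} \\
&= \int_0^t \mathbb{P}^{\vec\pi}\{\sigma_1 > t \mid \Theta = s\}\, \mathbb{P}^{\vec\pi}\{\Theta \in ds\} + \int_t^\infty \mathbb{P}^{\vec\pi}\{\sigma_1 > t \mid \Theta = s\}\, \mathbb{P}^{\vec\pi}\{\Theta \in ds\} \\
&= \int_0^t e^{-\lambda_0 s} e^{-\lambda_1 (t - s)}\, \mathbb{P}^{\vec\pi}\{\Theta \in ds\} + \int_t^\infty e^{-\lambda_0 t}\, \mathbb{P}^{\vec\pi}\{\Theta \in ds\} \\
&= e^{-\lambda_0 t}\, \mathbb{E}^{\vec\pi}\big\{ e^{-(\lambda_1 - \lambda_0)(t - \Theta)^+} \big\},
\end{aligned}
\tag{3.10}
\]

from which it follows that

\[
\mathbb{P}^{\vec\pi}\{\sigma_1 \in dt\} = e^{-\lambda_0 t} \Big[ \lambda_0 \cdot \mathbb{E}^{\vec\pi}\big\{ 1_{\{t < \Theta\}} \big\} + \lambda_1 \cdot \mathbb{E}^{\vec\pi}\big\{ 1_{\{t \ge \Theta\}}\, e^{-(\lambda_1 - \lambda_0)(t - \Theta)} \big\} \Big]\, dt. \tag{3.11}
\]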
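
To make (3.10) and (3.11) concrete, the sketch below evaluates both in the simplest phase-type case, $n = 1$, where the prior is exponential with rate `mu` plus an atom `pi` at zero. This special case, the function names, and the restriction $\lambda_1 - \lambda_0 \ne \mu$ are assumptions of the example rather than part of the general development.

```python
import math

def mean_discount(t, pi, mu, lam0, lam1):
    """E[exp(-(lam1 - lam0)(t - Theta)^+)] for an exponential(mu) prior with atom pi at zero."""
    a = lam1 - lam0  # assumed different from mu in this sketch
    after = mu * math.exp(-a * t) * (math.exp((a - mu) * t) - 1.0) / (a - mu)
    return pi * math.exp(-a * t) + (1.0 - pi) * (math.exp(-mu * t) + after)

def first_jump_survival(t, pi, mu, lam0, lam1):
    """P{sigma_1 > t} as in (3.10)."""
    return math.exp(-lam0 * t) * mean_discount(t, pi, mu, lam0, lam1)

def first_jump_density(t, pi, mu, lam0, lam1):
    """Density of sigma_1 as in (3.11)."""
    not_yet = (1.0 - pi) * math.exp(-mu * t)                  # E[1{t < Theta}]
    already = mean_discount(t, pi, mu, lam0, lam1) - not_yet  # E[1{t >= Theta} e^{-(lam1-lam0)(t-Theta)}]
    return math.exp(-lam0 * t) * (lam0 * not_yet + lam1 * already)
```

A quick consistency check is that the density integrates, over $[0, T]$, to one minus the survival probability at $T$.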

Remark 3.1. For a bounded function $w(\cdot)$, using the equations in (2.34), (3.10) and (3.11), it can be
verified easily that the integrands in (3.8) are absolutely integrable. Hence

\[
\lim_{t \to \infty} Jw(t, \vec\pi) = Jw(\infty, \vec\pi) < \infty,
\]

and the mapping $t \mapsto Jw(t, \vec\pi)$ is continuous on $[0, \infty]$. Therefore, the infimum in (3.7) is attained
for all $t \in [0, \infty]$.

Remark 3.2. (i) $0 \le J_0 w(\cdot) \le h(\cdot)$ for every non-negative and bounded function $w$.

(ii) For two bounded functions $w_1(\cdot) \le w_2(\cdot)$, we have $J_0 w_1(\cdot) \le J_0 w_2(\cdot)$.

Lemma 3.1. If $w : D \to \mathbb{R}_+$ is positive and concave, then so are the mappings

\[
\vec\pi \mapsto Jw(t, \vec\pi) \quad \text{and} \quad \vec\pi \mapsto J_0 w(\vec\pi). \tag{3.12}
\]

Lemma 3.2. If the function $w : D \to \mathbb{R}_+$ is bounded and continuous, then $(t, \vec\pi) \mapsto Jw(t, \vec\pi)$ and
$\vec\pi \mapsto J_0 w(\vec\pi)$ are also continuous functions.

Using the operator $J_0$, let us define a sequence of functions

\[
v_0(\vec\pi) = h(\vec\pi) \quad \text{and} \quad v_m(\vec\pi) = J_0 v_{m-1}(\vec\pi), \quad m \ge 1, \quad \text{for all } \vec\pi \in D. \tag{3.13}
\]

Corollary 3.1. Each $v_m(\cdot)$ is positive, continuous and concave on $D$. The sequence $\{v_m(\cdot)\}_{m \ge 1}$ is
decreasing; hence the pointwise limit $v(\vec\pi) = \lim_{m \to \infty} v_m(\vec\pi)$, $\vec\pi \in D$, exists. The function $v(\cdot)$ is
again concave.

Proof. The proof easily follows from Remark 3.2 and Lemmata 3.1 and 3.2. To prove the concavity of
$v(\cdot)$ we also use the fact that the lower envelope of concave functions is concave. □

The following lemma, which follows from Bremaud (1981), Theorem T.33, characterizes the
stopping times of piecewise deterministic Markov processes. Also see Davis (1993), Theorem A2.3.

Lemma 3.3. For every $\tau \in \mathcal{S}$, and for every $m \in \mathbb{N}_0$, there exists an $\mathcal{F}_{\sigma_m}$-measurable random
variable $R_m$ such that $\tau \wedge \sigma_{m+1} = (\sigma_m + R_m) \wedge \sigma_{m+1}$, $\mathbb{P}^{\vec\pi}$-almost surely on $\{\tau \ge \sigma_m\}$.

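Proposition 3.2. For every $\varepsilon > 0$, let us define

\[
r_m^\varepsilon(\vec\pi) = \inf\{ s \in (0, \infty] : Jv_m(s, \vec\pi) \le J_0 v_m(\vec\pi) + \varepsilon \}, \qquad \vec\pi \in D, \tag{3.14}
\]

\[
S_1^\varepsilon = r_0^\varepsilon(\vec\Pi_0) \wedge \sigma_1, \quad \text{and} \quad
S_{m+1}^\varepsilon =
\begin{cases}
r_m^{\varepsilon/2}(\vec\Pi_0) & \text{if } \sigma_1 > r_m^{\varepsilon/2}(\vec\Pi_0), \\
\sigma_1 + S_m^{\varepsilon/2} \circ \theta_{\sigma_1} & \text{if } \sigma_1 \le r_m^{\varepsilon/2}(\vec\Pi_0),
\end{cases}
\tag{3.15}
\]

where $\theta_s$ is the shift operator on $\Omega$, i.e., $X_t \circ \theta_s = X_{s+t}$. Then, for every $m \ge 1$,

\[
\mathbb{E}^{\vec\pi}\Big\{ \int_0^{S_m^\varepsilon} k(\vec\Pi_t)\, dt + h(\vec\Pi_{S_m^\varepsilon}) \Big\} \le v_m(\vec\pi) + \varepsilon. \tag{3.16}
\]

Moreover, for all $m \in \mathbb{N}$, $v_m(\vec\pi) = V_m(\vec\pi)$ on $D$.

Proposition 3.3. We have $v(\vec\pi) = V(\vec\pi)$ for every $\vec\pi \in D$. Moreover, $V$ is the largest solution of
$U = J_0 U$ that is smaller than or equal to $h$.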

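Lemma 3.4. For every bounded function $\vec\pi \mapsto w(\vec\pi)$, $\vec\pi \in D$, we have

\[
J_t w(\vec\pi) = Jw(t, \vec\pi) + \mathbb{P}^{\vec\pi}\{\sigma_1 > t\} \cdot \{ J_0 w(\vec{x}(t, \vec\pi)) - h(\vec{x}(t, \vec\pi)) \}, \tag{3.17}
\]

for all $t \ge 0$.

Corollary 3.2. Let

\[
r_m(\vec\pi) = \inf\{ t \in (0, \infty] : Jv_m(t, \vec\pi) = J_0 v_m(\vec\pi) \}. \tag{3.18}
\]

Then

\[
r_m(\vec\pi) = \inf\{ t \in (0, \infty] : v_{m+1}(\vec{x}(t, \vec\pi)) = h(\vec{x}(t, \vec\pi)) \}. \tag{3.19}
\]

Here, we use the convention that $\inf \emptyset = \infty$.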

Remark 3.3. Substituting $w = v_m$ in (3.17) yields the dynamic programming equation for the
sequence of functions $\{v_m(\cdot)\}_{m \in \mathbb{N}}$: for every $\vec\pi \in D$ and $m \in \mathbb{N}_0$,

\[
v_{m+1}(\vec\pi) = Jv_m(t, \vec\pi) + \mathbb{P}^{\vec\pi}\{\sigma_1 > t\} \cdot [ v_{m+1}(\vec{x}(t, \vec\pi)) - h(\vec{x}(t, \vec\pi)) ], \qquad t \in [0, r_m(\vec\pi)]. \tag{3.20}
\]

Moreover, if we take $w = V$ in (3.17), then we obtain

\[
J_t V(\vec\pi) = JV(t, \vec\pi) + \mathbb{P}^{\vec\pi}\{\sigma_1 > t\} \cdot [ V(\vec{x}(t, \vec\pi)) - h(\vec{x}(t, \vec\pi)) ], \qquad t \ge 0. \tag{3.21}
\]

Let us define

\[
r(\vec\pi) = \inf\{ t \in (0, \infty] : JV(t, \vec\pi) = J_0 V(\vec\pi) \}. \tag{3.22}
\]

The same arguments as in the proof of Corollary 3.2 lead to

\[
r(\vec\pi) = \inf\{ t \in (0, \infty] : V(\vec{x}(t, \vec\pi)) = h(\vec{x}(t, \vec\pi)) \}. \tag{3.23}
\]

This equation together with (3.21) yields

\[
V(\vec\pi) = JV(t, \vec\pi) + \mathbb{P}^{\vec\pi}\{\sigma_1 > t\} \cdot [ V(\vec{x}(t, \vec\pi)) - h(\vec{x}(t, \vec\pi)) ], \qquad t \in [0, r(\vec\pi)]. \tag{3.24}
\]

Remark 3.4. From Propositions 3.1, 3.2, 3.3 and Corollary 3.1 it follows that a sequence of
continuous functions converges uniformly to $v = V$. Therefore $V$ is continuous on $D$. Since
$t \mapsto \vec{x}(t, \vec\pi)$, $t \ge 0$, is continuous for all $\vec\pi \in D$, the mapping $t \mapsto V(\vec{x}(t, \vec\pi))$, $t \ge 0$, is also continuous
for all $\vec\pi \in D$. Moreover, for every $\vec\pi$, the path $t \mapsto \vec\Pi_t$, $t \ge 0$, follows the deterministic curves
$t \mapsto \vec{x}(t, \vec\pi)$ between the jumps. Hence the process $t \mapsto V(\vec\Pi_t)$ is right-continuous with left limits.
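Let us define the $\mathbb{F}$-stopping times

\[
U_\varepsilon = \inf\{ t \ge 0 : V(\vec\Pi_t) - h(\vec\Pi_t) \ge -\varepsilon \}, \qquad \varepsilon \ge 0. \tag{3.25}
\]

Remark 3.4 implies

\[
V(\vec\Pi_{U_\varepsilon}) - h(\vec\Pi_{U_\varepsilon}) \ge -\varepsilon \quad \text{on the event } \{U_\varepsilon < \infty\}. \tag{3.26}
\]

Proposition 3.4. Let

\[
L_t = \int_0^t k(\vec\Pi_s)\, ds + V(\vec\Pi_t), \qquad t \ge 0. \tag{3.27}
\]

Then for every $m \in \mathbb{N}$, $\varepsilon \ge 0$, $\vec\pi \in D$, we have $L_0 = \mathbb{E}^{\vec\pi}\{L_{U_\varepsilon \wedge \sigma_m}\}$, that is,

\[
V(\vec\pi) = \mathbb{E}^{\vec\pi}\Big\{ \int_0^{U_\varepsilon \wedge \sigma_m} k(\vec\Pi_s)\, ds + V(\vec\Pi_{U_\varepsilon \wedge \sigma_m}) \Big\}. \tag{3.28}
\]

Proposition 3.5. The stopping time $U_\varepsilon$, which is defined in (3.25), has bounded $\mathbb{P}^{\vec\pi}$-expectation
for every $\vec\pi \in D$ and $\varepsilon \ge 0$. More precisely,

\[
\mathbb{E}^{\vec\pi}\{U_\varepsilon\} \le \mathbb{E}^{\vec\pi}\{\Theta\} + \frac{1}{c}, \qquad \vec\pi \in D, \ \varepsilon \ge 0. \tag{3.29}
\]

Moreover, $U_\varepsilon$ is $\varepsilon$-optimal for the problem in (2.11); that is,

\[
\mathbb{E}^{\vec\pi}\Big\{ \int_0^{U_\varepsilon} k(\vec\Pi_s)\, ds + h(\vec\Pi_{U_\varepsilon}) \Big\} \le V(\vec\pi) + \varepsilon, \qquad \vec\pi \in D. \tag{3.30}
\]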
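Proof. Using Proposition 3.4 and the fact that $V$ is bounded above by 1, we have

\[
\begin{aligned}
1 \ge V(\vec\pi) &= \mathbb{E}^{\vec\pi}\Big\{ \int_0^{U_\varepsilon \wedge \sigma_m} k(\vec\Pi_s)\, ds + V(\vec\Pi_{U_\varepsilon \wedge \sigma_m}) \Big\} \\
&\ge \mathbb{E}^{\vec\pi}\Big\{ \int_0^{U_\varepsilon \wedge \sigma_m} k(\vec\Pi_s)\, ds \Big\} = c\, \mathbb{E}^{\vec\pi}\{ (U_\varepsilon \wedge \sigma_m - \Theta)^+ \} \ge c\, \mathbb{E}^{\vec\pi}\{ U_\varepsilon \wedge \sigma_m - \Theta \},
\end{aligned}
\tag{3.31}
\]

where we used (2.10) to derive the second equality. Applying the monotone convergence theorem as
$m \uparrow \infty$, equation (3.29) follows.

Next, the almost-sure finiteness of $U_\varepsilon$ implies

\[
V(\vec\pi) = \lim_{m \to \infty} \mathbb{E}^{\vec\pi}\Big\{ \int_0^{U_\varepsilon \wedge \sigma_m} k(\vec\Pi_s)\, ds + V(\vec\Pi_{U_\varepsilon \wedge \sigma_m}) \Big\} = \mathbb{E}^{\vec\pi}\Big\{ \int_0^{U_\varepsilon} k(\vec\Pi_s)\, ds + V(\vec\Pi_{U_\varepsilon}) \Big\}, \tag{3.32}
\]

by the monotone and bounded convergence theorems and Proposition 3.4. Since $V(\vec\Pi_{U_\varepsilon}) - h(\vec\Pi_{U_\varepsilon}) \ge -\varepsilon$,
we have

\[
\begin{aligned}
V(\vec\pi) &= \mathbb{E}^{\vec\pi}\Big\{ \int_0^{U_\varepsilon} k(\vec\Pi_s)\, ds + V(\vec\Pi_{U_\varepsilon}) - h(\vec\Pi_{U_\varepsilon}) + h(\vec\Pi_{U_\varepsilon}) \Big\} \\
&\ge \mathbb{E}^{\vec\pi}\Big\{ \int_0^{U_\varepsilon} k(\vec\Pi_s)\, ds + h(\vec\Pi_{U_\varepsilon}) \Big\} - \varepsilon.
\end{aligned}
\tag{3.33}
\]

This completes the proof. □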
4 Approximating the Value Function to a Given Level of Accuracy

In this section, we describe a numerical procedure that approximates the value function within
any given positive margin, say $\varepsilon$, and we construct $\varepsilon$-optimal stopping strategies. In the next
section, we give several examples to illustrate the efficacy of the numerical procedure.

4.1 Properties of the Stopping Regions 

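Let us introduce the stopping and continuation regions for the problem in (2.11):

\[
\Gamma = \{ \vec\pi \in D : V(\vec\pi) = h(\vec\pi) \}, \qquad C = D \setminus \Gamma. \tag{4.1}
\]

Taking $\varepsilon = 0$ in Proposition 3.5 implies that $U_0$ is an optimal stopping time for (2.11). From
Remark 2.2, we see that an admissible rule minimizing the Bayes risk in (2.8) is to observe the
process $X$ until the process $\vec\Pi$ of (2.23) and (2.24) enters the stopping region $\Gamma$.

Remark 4.1. Since $V$ and $h$ are continuous (the continuity of $V$ follows from Remark 3.4), $\Gamma$ is
closed. Moreover, since $V$ is a concave function (see Corollary 3.1 and Proposition 3.3) and $h$ is
linear, $\Gamma$ is a convex set. Indeed, if $\vec\pi_1, \vec\pi_2 \in \Gamma$, then for any $\alpha \in [0, 1]$

\[
V(\alpha \vec\pi_1 + (1 - \alpha)\vec\pi_2) \ge \alpha V(\vec\pi_1) + (1 - \alpha) V(\vec\pi_2) = \alpha h(\vec\pi_1) + (1 - \alpha) h(\vec\pi_2) = h(\alpha \vec\pi_1 + (1 - \alpha)\vec\pi_2). \tag{4.2}
\]

Since $V(\vec\pi) \le h(\vec\pi)$ for all $\vec\pi \in D$, this equation implies that $V(\alpha \vec\pi_1 + (1 - \alpha)\vec\pi_2) = h(\alpha \vec\pi_1 + (1 - \alpha)\vec\pi_2)$.
Therefore, $\alpha \vec\pi_1 + (1 - \alpha)\vec\pi_2 \in \Gamma$.

Proposition 4.1. The stopping region $\Gamma$ is not empty. In particular,

\[
\Big\{ \vec\pi \in D : \pi \ge \frac{\max\{\lambda_0, \lambda_1\} + B}{c + \max\{\lambda_0, \lambda_1\} + B} \Big\} \subseteq \Gamma, \tag{4.3}
\]

in which

\[
B = \max_{1 \le i \le n} q_{i\Delta}. \tag{4.4}
\]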



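Proof. For $w(\cdot) \ge 0$, using (3.10) and (3.11), we write

\[
Jw(t, \vec\pi) \ge \mathbb{E}^{\vec\pi}\Big\{ \int_0^{t \wedge \sigma_1} k(\vec\Pi_s)\, ds + 1_{\{t < \sigma_1\}} h(\vec\Pi_t) \Big\}
\ge c \int_0^t \pi e^{-\lambda_1 s}\, ds + e^{-\lambda_0 t} \sum_{i=1}^n \pi_i \int_t^\infty f_i(s)\, ds,
\]

where $f_i(\cdot)$ is the probability density function of $\Theta$ given that $M_0 = i$, for $i \le n$.
If $\lambda_1 \ge \lambda_0$, then

\[
Jw(t, \vec\pi) \ge c \int_0^t \pi e^{-\lambda_1 s}\, ds + e^{-\lambda_1 t} \sum_{i=1}^n \pi_i \int_t^\infty f_i(s)\, ds =: K(t, \vec\pi), \qquad t \ge 0, \ \vec\pi \in D. \tag{4.5}
\]

Note that $K(0, \vec\pi) = h(\vec\pi)$. The derivative of $K$ with respect to $t$ is

\[
\frac{\partial K}{\partial t}(t, \vec\pi) = e^{-\lambda_1 t} \Big( c\pi - \lambda_1 \sum_{i=1}^n \pi_i \int_t^\infty f_i(s)\, ds - \sum_{i=1}^n \pi_i f_i(t) \Big). \tag{4.6}
\]

Since $f_i(t) = dp_{i\Delta}(t)/dt$, Kolmogorov's forward equation (2.29) implies that

\[
f_i(t) \le \max_{1 \le i \le n} q_{i\Delta} = B. \tag{4.7}
\]

Therefore,

\[
\frac{\partial K}{\partial t}(t, \vec\pi) \ge 0 \quad \text{if} \quad \pi \ge \frac{\lambda_1 + B}{c + \lambda_1 + B}. \tag{4.8}
\]

Then for $\pi \ge \frac{\lambda_1 + B}{c + \lambda_1 + B}$,

\[
K(t, \vec\pi) \ge h(\vec\pi) \implies Jw(t, \vec\pi) \ge h(\vec\pi) \implies J_0 w(\vec\pi) = h(\vec\pi). \tag{4.9}
\]

Since $V(\cdot) = J_0 V(\cdot)$, taking $w = V$ in the last equation, we see that if $\pi \ge \frac{\lambda_1 + B}{c + \lambda_1 + B}$, then
$\vec\pi = (\pi_1, \ldots, \pi_n, \pi) \in D$ belongs to $\Gamma$. Similarly, if $\lambda_0 > \lambda_1$, it can be shown that if
$\pi \ge \frac{\lambda_0 + B}{c + \lambda_0 + B}$, then $\vec\pi = (\pi_1, \ldots, \pi_n, \pi) \in D$ belongs to $\Gamma$.

□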
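
The sufficient stopping level in (4.3) is straightforward to evaluate. The helper below is a sketch written for this illustration; its name and its argument `q_absorb` (the list of absorption rates $q_{1\Delta}, \ldots, q_{n\Delta}$) are assumptions, not notation from the paper.

```python
def stopping_threshold(c, lam0, lam1, q_absorb):
    """Level of the absorbed coordinate above which stopping is optimal, per (4.3)-(4.4)."""
    b_max = max(q_absorb)           # B in (4.4)
    top = max(lam0, lam1) + b_max
    return top / (c + top)
```

For example, with $c = 1$, $\lambda_0 = 1$, $\lambda_1 = 3$ and absorption rates $(0.5, 2)$, we get $B = 2$ and the level $5/6$: once the posterior probability of a past disorder reaches $5/6$, the alarm should be raised regardless of the other coordinates. The level approaches 1 as the delay cost $c$ decreases to 0.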

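Let us define the optimal stopping and continuation regions for the problems that we introduced
in (3.1) as

\[
\Gamma_m \triangleq \{ \vec\pi \in D : V_m(\vec\pi) = h(\vec\pi) \}, \quad \text{and} \quad C_m = D \setminus \Gamma_m, \qquad m \ge 0. \tag{4.10}
\]

Arguments similar to those in Remark 4.1 imply that $\Gamma_m$ is a closed and convex subset of $D$ for all
$m \in \mathbb{N}_0$. In fact, these sets are ordered, i.e.,

\[
\Big\{ \vec\pi \in D : \pi \ge \frac{\max\{\lambda_0, \lambda_1\} + B}{c + \max\{\lambda_0, \lambda_1\} + B} \Big\} \subset \Gamma \subset \cdots \subset \Gamma_m \subset \cdots \subset \Gamma_1 \subset \Gamma_0, \tag{4.11}
\]

since $V(\cdot) \le \cdots \le V_1(\cdot) \le V_0(\cdot) = h(\cdot)$.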

4.2 Two Computable e-Optimal Strategies 

The value function $V(\cdot)$, which is defined in (2.8), can be approximated by the sequence $\{V_m(\cdot)\}_{m \in \mathbb{N}}$,
as Proposition 3.1 suggests. Each element of the sequence $\{V_m\}_{m \in \mathbb{N}}$ can be computed by a successive
application of the operator $J_0$, which is defined in (3.7), to the function $h(\cdot)$; see (3.13) and
Proposition 3.2. Moreover, the error in approximating $V(\cdot)$ by $\{V_m(\cdot)\}_{m \in \mathbb{N}}$ can be controlled. Due
to Proposition 3.1, for every $\varepsilon > 0$, if we choose $M_\varepsilon$ as

\[
M_\varepsilon \triangleq 1 + \frac{\max\{\lambda_0, \lambda_1\}}{\varepsilon^2} \Big( \frac{1}{c} + \mathbb{E}^{\vec\pi}\{\Theta\} \Big) \implies \|V_M - V\|_\infty = \sup_{\vec\pi \in D} |V_M(\vec\pi) - V(\vec\pi)| \le \varepsilon, \quad M \ge M_\varepsilon.
\tag{4.12}
\]
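
The iteration count in (4.12) can be computed directly from the model parameters. The helper below is a sketch for illustration; the name `iterations_for_accuracy` and the argument `mean_disorder_time` (standing for the prior mean of the disorder time) are assumptions introduced here, and rounding up to an integer is a choice made for the example.

```python
import math

def iterations_for_accuracy(eps, c, lam0, lam1, mean_disorder_time):
    """Smallest integer M with M >= 1 + max(lam0, lam1) / eps^2 * (1/c + E[Theta]), as in (4.12)."""
    return math.ceil(1.0 + max(lam0, lam1) / eps ** 2 * (1.0 / c + mean_disorder_time))
```

With $\varepsilon = 0.5$, $c = 2$, $\lambda_0 = 1$, $\lambda_1 = 3$ and a prior mean of $1.5$, this gives $M = 1 + 12 \cdot 2 = 25$, and the bound of Proposition 3.1 then equals $\sqrt{2 \cdot 3 / 24} = 0.5$. Since the count grows like $\varepsilon^{-2}$, halving the tolerance roughly quadruples the number of iterations.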

In the next section, we will give a numerical algorithm to compute $V_1, V_2, \ldots$ iteratively. Here,
we will describe two $\varepsilon$-optimal strategies using these functions.

Recall from Proposition 3.2 that $S_m^\varepsilon$, $m \ge 1$, are $\varepsilon$-optimal stopping times for the problem in
(3.1). For a fixed $\varepsilon > 0$, if we choose $M \ge M_{\varepsilon/2}$, we have $\|V_M - V\|_\infty \le \varepsilon/2$. Then $S_M^{\varepsilon/2}$ is
$\varepsilon$-optimal for $V(\cdot)$, since

\[
\mathbb{E}^{\vec\pi}\Big\{ \int_0^{S_M^{\varepsilon/2}} k(\vec\Pi_t)\, dt + h(\vec\Pi_{S_M^{\varepsilon/2}}) \Big\} \le v_M(\vec\pi) + \frac{\varepsilon}{2} \le V(\vec\pi) + \varepsilon, \qquad \vec\pi \in D. \tag{4.13}
\]

Note that $S_M^{\varepsilon/2}$ is not a hitting time. By (3.15), it prescribes waiting until the minimum of $r_{M-1}^{\varepsilon/4}(\vec\pi)$
and the first jump time $\sigma_1$ of the process $X$. If $r_{M-1}^{\varepsilon/4}(\vec\pi) < \sigma_1$, then we stop. Otherwise, the
probabilities are updated to $\vec\Pi_{\sigma_1}$ and we wait until the minimum of $r_{M-2}^{\varepsilon/8}(\vec\Pi_{\sigma_1})$ and the next jump
time $\sigma_2 = \sigma_1 + \sigma_1 \circ \theta_{\sigma_1}$ of the process $X$. If $r_{M-2}^{\varepsilon/8}(\vec\Pi_{\sigma_1})$ comes first, we stop. Otherwise we continue as
before. We finally stop at the $M$th jump time if not before.

We can also give an $\varepsilon$-optimal strategy that is a hitting time. Let us define

\[
U_{\varepsilon/2}^{(M)} \triangleq \inf\Big\{ t \ge 0 : h(\vec\Pi_t) \le V_M(\vec\Pi_t) + \frac{\varepsilon}{2} \Big\}. \tag{4.14}
\]

Following the same arguments as in the proof of Proposition 3.5, this stopping time can be shown to
be an $\varepsilon/2$-optimal stopping time for $V_M(\cdot)$, which in turn implies that it is an $\varepsilon$-optimal stopping
time for $V(\cdot)$.

4.3 An Algorithm Approximating the Value Function 

Note that if the time it takes the path $t \mapsto \vec{x}(t, \vec\pi)$ to enter the region $\Gamma$ is uniformly bounded by some $t^* < \infty$,
then the minimization problem in computing $V_{m+1}(\vec\pi) = \inf_{t \in [0, \infty]} JV_m(t, \vec\pi)$ can be restricted to
the compact interval $[0, t^*]$ thanks to Corollary 3.2. Remark 4.2 constructs a uniform bound $t^*$
when the parameters of the problem satisfy $\underline{B} - \lambda_1 + \lambda_0 \ge 0$, in which $\underline{B}$ is defined as

\[
\underline{B} = \min_{1 \le i \le n} q_{i\Delta}. \tag{4.15}
\]

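Remark 4.2. The hazard rate of the prior distribution of $\Theta$ satisfies $\eta(t) \ge \underline{B}$ (see (2.28)). Moreover,
from (2.15), we have

\[
\frac{dx_\Delta(t, \vec\pi)}{dt} = (\eta(t) - (\lambda_1 - \lambda_0)\, x_\Delta(t, \vec\pi))(1 - x_\Delta(t, \vec\pi)), \qquad x_\Delta(0, \vec\pi) = \pi, \tag{4.16}
\]

where $x_\Delta$ is defined in (2.32). Let $\underline{x}(t)$ be the solution of the differential equation

\[
\frac{d\underline{x}(t)}{dt} = (\underline{B} - (\lambda_1 - \lambda_0)\, \underline{x}(t))(1 - \underline{x}(t)), \qquad \text{with } \underline{x}(0) = 0. \tag{4.17}
\]

A simple comparison argument shows that $x_\Delta(t, \vec\pi) \ge \underline{x}(t)$ for all $t \ge 0$ when $\underline{B} - \lambda_1 + \lambda_0 \ge 0$. The
solution to (4.17) can be written as

\[
\underline{x}(t) =
\begin{cases}
\dfrac{\frac{\underline{B}}{\lambda_1 - \lambda_0 - \underline{B}} \big( 1 - \exp((\underline{B} - \lambda_1 + \lambda_0)t) \big)}{1 + \frac{\underline{B}}{\lambda_1 - \lambda_0 - \underline{B}} \big( 1 - \exp((\underline{B} - \lambda_1 + \lambda_0)t) \big)} & \text{if } \underline{B} - \lambda_1 + \lambda_0 \ne 0, \\[2ex]
\dfrac{\underline{B} t}{1 + \underline{B} t} & \text{if } \underline{B} - \lambda_1 + \lambda_0 = 0.
\end{cases}
\tag{4.18}
\]

When $\underline{B} - \lambda_1 + \lambda_0 \ge 0$, let us denote

\[
\bar{x} \triangleq \frac{\max\{\lambda_0, \lambda_1\} + B}{c + \max\{\lambda_0, \lambda_1\} + B}, \tag{4.19}
\]

where $B$ is given in (4.4). Let $t^*(\vec\pi) = \inf\{t \ge 0 : x_\Delta(t, \vec\pi) = \bar{x}\}$; then, using (4.18), it can be easily
verified that

\[
t^*(\vec\pi) \le
\begin{cases}
\dfrac{1}{\underline{B} - \lambda_1 + \lambda_0} \log\bigg( \dfrac{\underline{B} - (\lambda_1 - \lambda_0)\bar{x}}{\underline{B}(1 - \bar{x})} \bigg) & \text{if } \underline{B} - \lambda_1 + \lambda_0 > 0, \\[2ex]
\dfrac{\bar{x}}{\underline{B}(1 - \bar{x})} & \text{if } \underline{B} - \lambda_1 + \lambda_0 = 0,
\end{cases}
\tag{4.20}
\]

for all $\vec\pi \in D$.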
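
The closed form (4.18) and the time at which the lower solution reaches a given level are easy to evaluate. The sketch below assumes the reconstruction of (4.18) used here; the function names, the argument `b_min` (standing for the minimal absorption rate in (4.15)) and the tolerance used to detect the degenerate case are all choices made for this illustration.

```python
import math

def lower_solution(t, b_min, lam0, lam1):
    """Closed-form solution of (4.17) started from zero, following (4.18)."""
    g = b_min - lam1 + lam0
    if abs(g) < 1e-12:
        return b_min * t / (1.0 + b_min * t)
    ratio = b_min / (lam1 - lam0 - b_min)
    u = ratio * (1.0 - math.exp(g * t))
    return u / (1.0 + u)

def time_to_level(level, b_min, lam0, lam1):
    """Time at which the lower solution reaches `level`; assumes b_min - lam1 + lam0 >= 0."""
    g = b_min - lam1 + lam0
    if abs(g) < 1e-12:
        return level / (b_min * (1.0 - level))
    return math.log((b_min - (lam1 - lam0) * level) / (b_min * (1.0 - level))) / g
```

As a worked example, a minimal absorption rate of 2 with $\lambda_0 = 1$, $\lambda_1 = 2$ gives $\underline{x}(t) = 2(e^t - 1)/(2e^t - 1)$, which reaches the level $0.8$ at $t = \log 3$. Passing the level from (4.19) to `time_to_level` yields a uniform truncation time for the minimization over $t$.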

Remark 4.3. — Ai + Ao < 0, rj/ien t*(ir) defined in Remark 4-2 may be oo for some tt € D. 

When $B - \lambda_1 + \lambda_0 < 0$, it is still possible to restrict the minimization problem in $V_{m+1}(\pi) = \inf_{t \in [0,\infty]} JV_m(t,\pi)$ to a compact interval and control the error arising from this. Note that for any
$w(\cdot) \le 1$, we have

$$ \begin{aligned} \sup_{\pi \in D} |Jw(t,\pi) - Jw(\infty,\pi)| &\le c \int_t^\infty P^\pi\{s < \sigma_1\}\,ds + \int_t^\infty P^\pi\{\sigma_1 \in ds\}\, Sw(x(s,\pi)) + h(x(t,\pi))\, P^\pi\{t < \sigma_1\} \\ &\le c \int_t^\infty P^\pi\{s < \sigma_1\}\,ds + 2 P^\pi\{\sigma_1 > t\} \le c \int_t^\infty e^{-\lambda_0 s}\,ds + 2 e^{-\lambda_0 t} \le \left(\frac{c}{\lambda_0} + 2\right) e^{-\lambda_0 t}, \end{aligned} \tag{4.21} $$

where the first inequality follows from (3.8), the second one follows from the fact that $V_m \le h \le 1$, and the third one follows from $P^\pi\{\sigma_1 > t\} \le e^{-\lambda_0 t}$, which is a direct consequence of (3.10).
Then, denoting

$$ t(\delta) \triangleq -\frac{1}{\lambda_0} \ln\left(\frac{\delta/2}{c/\lambda_0 + 2}\right), \tag{4.22} $$

we obtain

$$ |Jw(t_1,\pi) - Jw(t_2,\pi)| \le |Jw(t_1,\pi) - Jw(\infty,\pi)| + |Jw(t_2,\pi) - Jw(\infty,\pi)| \le \delta \tag{4.23} $$

for any $t_1, t_2 \ge t(\delta)$. Letting

$$ J_{0,t}w(\pi) = \inf_{s \in [0,t]} Jw(s,\pi), \quad \text{for every bounded } w : D \to \mathbb{R}, \text{ and } t \ge 0,\ \pi \in D, \tag{4.24} $$

we get $\sup_{\pi \in D} |J_{0,t(\delta)}w(\pi) - J_0 w(\pi)| \le \delta$. Now, let us define a new sequence of functions as

$$ V_{\delta,0}(\pi) = h(\pi) \quad \text{and} \quad V_{\delta,m+1}(\pi) = J_{0,t(\delta)} V_{\delta,m}(\pi), \quad \pi \in D. \tag{4.25} $$
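The truncation horizon can be computed directly from the tail bound at the end of (4.21). The sketch below assumes that $t(\delta)$ is taken as the smallest $t \ge 0$ for which $(c/\lambda_0 + 2)e^{-\lambda_0 t} \le \delta/2$, so that two such tails add up to at most $\delta$ as in (4.23); the function names are illustrative only.

```python
import math

def tail_bound(t, c, lam0):
    """Right-hand side of the tail estimate: (c/lam0 + 2) * exp(-lam0 * t)."""
    return (c / lam0 + 2.0) * math.exp(-lam0 * t)

def truncation_time(delta, c, lam0):
    """Smallest t >= 0 with tail_bound(t, c, lam0) <= delta / 2 (assumed form of t(delta))."""
    ratio = (delta / 2.0) / (c / lam0 + 2.0)
    return max(0.0, -math.log(ratio) / lam0)
```

Because the bound decays exponentially, halving $\delta$ lengthens the horizon only by $\ln 2 / \lambda_0$.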

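Proposition 4.2. For every $\delta > 0$, $m \ge 0$, we have

$$ V_m(\pi) \le V_{\delta,m}(\pi) \le m\delta + V_m(\pi), \quad \pi \in D. \tag{4.26} $$

Proof. For $m = 0$ we have $V_{\delta,0}(\cdot) = V_0(\cdot) = h(\cdot)$, by construction. Now, suppose that (4.26) holds
for some $m \ge 0$. Then

$$ V_{m+1}(\pi) = J_0 V_m(\pi) \le J_0 V_{\delta,m}(\pi) \le J_{0,t(\delta)} V_{\delta,m}(\pi) = V_{\delta,m+1}(\pi), \tag{4.27} $$

which proves the first inequality in (4.26) when we substitute $m$ with $m + 1$. The first inequality
in (4.27) follows from the induction hypothesis and Remark 3.2. The second inequality follows from (4.24).

Let us now prove the second inequality in (4.26) when $m$ is replaced by $m + 1$. Observe that
$V_{\delta,m}(\pi) \le h(\pi)$, $\pi \in D$. Then

$$ \begin{aligned} V_{\delta,m+1}(\pi) &= \inf_{t \in [0,t(\delta)]} JV_{\delta,m}(t,\pi) \le \inf_{t \in [0,\infty]} JV_{\delta,m}(t,\pi) + \delta \\ &\le \inf_{t \in [0,\infty]} \left[ JV_m(t,\pi) + m\delta \int_0^t P^\pi\{\sigma_1 \in ds\} \right] + \delta \\ &\le V_{m+1}(\pi) + m\delta \int_0^\infty P^\pi\{\sigma_1 \in ds\} + \delta \le V_{m+1}(\pi) + (m+1)\delta, \end{aligned} \tag{4.28} $$

where the second inequality follows from the induction hypothesis and the definition of the operator
$J$. □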

When $B - \lambda_1 + \lambda_0 < 0$, using Proposition 4.2 we can approximate the value function $V(\cdot)$ with
the functions $\{V_{\delta,m}(\cdot)\}_{\delta > 0,\, m \ge 1}$. There is an extra error, because we truncate at $t(\delta)$, but this can
be compensated by increasing the number of iterations. Let us define



M e 4 1 + - 



l + -+E^{0} ) max{A ,Ai} 



and 5 F 



m £ Vm £ - 1 



(4.29) 



Then for every $M \ge M_\varepsilon$ and $\delta \le \delta_\varepsilon$ we have



\\V s ,m ~ V\\oo < \\V s ,m -V M \\ + \\V M -V\\<M5 + J(± + E-{6}) ma ^^ < e , (4.30) 

where we used Propositions 3.1 and 4.2. In other words, by applying the operator $J_{0,t(\delta_\varepsilon)}$ to the
function $h(\cdot)$ $M_\varepsilon$ times, we obtain an approximation of $V(\cdot)$ within $\varepsilon$-closeness on $D$.

Similar arguments as in Section 3 can be repeated to show that each $V_{\delta,m}$, for $m \ge 1$, is
continuous and concave on $D$. Moreover, we can still define $\varepsilon$-optimal rules using the function
$V_{\delta,M}$. In particular, let us define the stopping time

$$ U^{(M,\delta)}_{\varepsilon/2} \triangleq \inf\{ t \ge 0 : h(\Pi_t) \le V_{\delta,M}(\Pi_t) + \varepsilon/2 \}. \tag{4.31} $$

When we take $M = M_{\varepsilon/2}$ and $\delta = \delta_{\varepsilon/2}$, this stopping time becomes an $\varepsilon$-optimal stopping time
for the problem in (2.8). This follows using the same arguments as in the proof of Proposition 3.5.

Finally, we conclude this section with the following numerical algorithm, which summarizes the results
presented here in order to approximate $V(\cdot)$.

Algorithm.

1) If $B - \lambda_1 + \lambda_0 \ge 0$, then choose $M \ge M_\varepsilon$. $B$ is given by (4.15) and $M_\varepsilon$ is given by (4.12).
1') On the other hand, if $B - \lambda_1 + \lambda_0 < 0$, then choose $M \ge M_\varepsilon$ and $\delta \le \delta_\varepsilon$, in which $M_\varepsilon$ and
$\delta_\varepsilon$ are given by (4.29).

2) Set $V_0(\cdot) = h(\cdot)$.
2') Set $V_{\delta,0}(\cdot) = h(\cdot)$.

3) Calculate $V_{m+1}(\pi) = \min_{t \in [0,t^*(\pi)]} JV_m(t,\pi)$, $\pi \in D$, in which $t^*(\pi)$ is given in Remark 4.2
(see also (4.20)).

3') Calculate $V_{\delta,m+1}(\pi) = \min_{t \in [0,t(\delta)]} JV_{\delta,m}(t,\pi)$, $\pi \in D$, in which $t(\delta)$ is defined in (4.22).

4) Repeat step 3 until $m = M + 1$.
4') Repeat step 3' until $m = M + 1$.

If $B - \lambda_1 + \lambda_0 \ge 0$, our algorithm returns $V_M$, which satisfies $\|V_M - V\| \le \varepsilon$. On the other
hand, if $B - \lambda_1 + \lambda_0 < 0$, the algorithm returns $V_{\delta,M}$, which satisfies $\|V_{\delta,M} - V\| \le \varepsilon$.
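The control flow of the algorithm can be sketched in a few lines of Python. This is only a skeleton under stated assumptions: the state space $D$ is replaced by a finite grid of points, the inner minimization over $t$ is done by a plain grid search, and the operator $J$ is passed in as a user-supplied callable `J(V, t, pi)`. The names `approximate_value`, `horizon`, `toy_J` and `toy_grid` are hypothetical, and `toy_J` is an artificial stand-in chosen so that the iteration has a known limit; it is not the operator of (3.8).

```python
import math

def approximate_value(grid, h, J, horizon, n_iter, n_t=200):
    """Iterate V_{m+1}(pi) = min over t in [0, horizon(pi)] of J(V_m, t, pi).

    grid    : iterable of hashable states (a discretization of D, an assumption)
    h       : terminal cost, h(pi)
    J       : callable J(V, t, pi), with V a dict mapping grid states to values
    horizon : callable giving the upper limit of the time search at pi,
              e.g. t*(pi) in the unprimed steps or the constant t(delta) in the primed ones
    """
    V = {pi: h(pi) for pi in grid}                 # step 2 / 2': start from h
    for _ in range(n_iter):                        # steps 3-4 / 3'-4'
        V_next = {}
        for pi in grid:
            T = horizon(pi)
            times = [T * j / n_t for j in range(n_t + 1)]
            V_next[pi] = min(J(V, t, pi) for t in times)
        V = V_next
    return V

def toy_J(V, t, pi):
    """Artificial operator used only to exercise the loop (fixed point near 0.2)."""
    q = math.exp(-t)
    return q * 1.0 + (1.0 - q) * (0.5 * V[pi] + 0.1)

toy_grid = [(1.0, 0.0, 0.0), (0.2, 0.3, 0.5)]
toy_V = approximate_value(toy_grid, lambda pi: 1.0, toy_J, lambda pi: 10.0, n_iter=30)
```

In an actual implementation `J` would have to evaluate the integrals in (3.8) along the deterministic paths and interpolate `V` off the grid; those pieces are deliberately left out here.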



5 Examples 

In this section, we provide examples illustrating the use of the numerical algorithm presented above
for negligible values of $\varepsilon$.

5.1 Mixed Erlang distribution

In (2.2) let us take a particular form for $A$ where all entries are zero except $q_{ii} = -\lambda$, $q_{i,i+1} = \lambda$ for
some rate $\lambda > 0$, and for $i = 1, \dots, n$. Then, starting from any non-absorbing state $i$, the process
$M$ visits all the states $i + 1, i + 2, \dots$ until it eventually hits the absorbing state $\Delta$. In other words,
conditioned on any initial non-absorbing state $i$, the disorder time has Erlang distribution with the
shape index $n - i + 1$ and rate $\lambda$. In this case the distribution of $\Theta$ can be explicitly given as

$$ F^\pi(t) = P^\pi\{\Theta \le t\} = \pi_\Delta + \sum_{i=1}^n \pi_i \int_0^t f_i(u)\,du, \quad \text{in terms of } f_i(t) \triangleq \frac{\lambda^{n-i+1} t^{n-i}}{(n-i)!} e^{-\lambda t}, $$

for $i \le n$. Moreover, the components of the deterministic path $x(\cdot,\cdot)$ have the explicit forms

$$ x_i(t,\pi) = \frac{\sum_{k=1}^{i} \pi_k e^{-\lambda t} \frac{(\lambda t)^{i-k}}{(i-k)!}}{\left(\sum_{k=1}^n \sum_{j=k}^n \pi_k e^{-\lambda t} \frac{(\lambda t)^{j-k}}{(j-k)!}\right) + e^{-(\lambda_1-\lambda_0)t}\left(\pi_\Delta + \sum_{k=1}^n \pi_k \int_0^t e^{(\lambda_1-\lambda_0)u} f_k(u)\,du\right)} $$

for $i \le n$, and for the $(n+1)$st component we have

$$ x_\Delta(t,\pi) = \frac{e^{-(\lambda_1-\lambda_0)t}\left(\pi_\Delta + \sum_{k=1}^n \pi_k \int_0^t e^{(\lambda_1-\lambda_0)u} f_k(u)\,du\right)}{\left(\sum_{k=1}^n \sum_{j=k}^n \pi_k e^{-\lambda t} \frac{(\lambda t)^{j-k}}{(j-k)!}\right) + e^{-(\lambda_1-\lambda_0)t}\left(\pi_\Delta + \sum_{k=1}^n \pi_k \int_0^t e^{(\lambda_1-\lambda_0)u} f_k(u)\,du\right)}. $$

Using these expressions (and assuming $\pi_\Delta \ne 1$), it can be shown that if $\lambda - \lambda_1 + \lambda_0 \ge 0$ then
$x_\Delta(t,\pi) \to 1$ as $t \to \infty$. Otherwise, we have

$$ \lim_{t \to \infty} x_n(t,\pi) = 1 - \frac{\lambda}{\lambda_1 - \lambda_0}, \quad \text{and} \quad \lim_{t \to \infty} x_\Delta(t,\pi) = \frac{\lambda}{\lambda_1 - \lambda_0}. \tag{5.1} $$

Due to the explicit form of the paths $t \mapsto x(t,\pi)$, the steps described in the numerical algorithm
above are easier to carry out. Figure 1 below illustrates two different problems, each with
two transient states.
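The claim at the start of this subsection, that the absorption time started from state $i$ is Erlang with shape $n - i + 1$, can be checked numerically. The following sketch integrates the forward equations of the chain $1 \to 2 \to \dots \to n \to \Delta$ with a Runge-Kutta scheme and compares the absorption probability with the closed-form Erlang distribution function. The function names, the step count and the test parameters are illustrative choices, not taken from the paper.

```python
import math

def erlang_cdf(t, shape, rate):
    """P(Erlang(shape, rate) <= t) for an integer shape."""
    s = sum((rate * t) ** j / math.factorial(j) for j in range(shape))
    return 1.0 - math.exp(-rate * t) * s

def absorption_cdf(t, n, rate, start, steps=4000):
    """P(absorbed by time t | M_0 = start) for the chain 1 -> 2 -> ... -> n -> Delta.

    p[j] is the probability of sitting in transient state j + 1; the forward
    equations p' = p Q are integrated with classical RK4.
    """
    p = [0.0] * n
    p[start - 1] = 1.0

    def deriv(v):
        # each state loses mass at `rate` and gains it from its predecessor
        return [-rate * v[j] + (rate * v[j - 1] if j > 0 else 0.0) for j in range(n)]

    h = t / steps
    for _ in range(steps):
        k1 = deriv(p)
        k2 = deriv([p[j] + 0.5 * h * k1[j] for j in range(n)])
        k3 = deriv([p[j] + 0.5 * h * k2[j] for j in range(n)])
        k4 = deriv([p[j] + h * k3[j] for j in range(n)])
        p = [p[j] + h * (k1[j] + 2.0 * k2[j] + 2.0 * k3[j] + k4[j]) / 6.0 for j in range(n)]
    return 1.0 - sum(p)
```

A mixed Erlang prior is then the $\pi$-weighted combination of these distribution functions, plus the mass $\pi_\Delta$ at zero.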

In Panels (a) and (b) of Figure 1, we see the sample path behavior and the value function
of a problem where the parameters are $\lambda_0 = 6$, $\lambda_1 = 5$, $\lambda = 3$, $c = 1$. Panel (a) presents the
behavior of the paths $t \mapsto x(t,\pi)$ for a number of different starting points. We also plot a sample
path of $\Pi_t$ starting from a particular point. Since $\lambda_0 > \lambda_1$, the $(n+1)$st component $x_\Delta$ of $x$ is
increasing. In other words, as long as we do not observe any arrival, we tend to assign more
likelihood to the event that the disorder has happened by then. On the other hand, when we observe
an arrival, we decrease this likelihood. Moreover, since $\lambda - \lambda_1 + \lambda_0 \ge 0$, we see that the paths of
$x$ converge asymptotically to the point $(0,0,1)$ as indicated above. In this case, we use steps (1),
(2), (3) and (4) of the algorithm presented at the end of Section 4.3 to approximate the value
function to a given order of accuracy. Thanks to the properties of the approximating sequence
(see Section 3), properties such as concavity of the value function and the convexity of the optimal
stopping boundary are preserved by our approximation. Panel (b), on the right, illustrates the
(approximated) value function defined on the state space $D$ of $\Pi$. As the figure shows, the value
function $V(\cdot)$ is non-negative and concave on $D$, and there exists a region in the neighborhood of
the point $(0,0,1)$ where it coincides with the terminal reward function $h(\cdot)$. As indicated in Section
4, an ($\varepsilon$-)optimal strategy is then to observe the counting process $X$ and update the
process $\Pi$ continuously until $\Pi$ enters the region $\Gamma$. At this time, we stop and declare that the
disorder has happened by then.

Figure 1: Examples with mixed Erlang prior distributions. Panels (a) and (b) correspond to a
problem with $\lambda_0 = 6$, $\lambda_1 = 5$, $\lambda = 3$, $c = 1$. Panel (a) represents the sample path behavior of
$t \mapsto x(t,\pi)$ and $t \mapsto \Pi_t$. The continuous curves are possible sample paths of $x$ starting from
different points. The discontinuous path with arrows indicates the behavior of $\Pi$. As indicated in
Section 2, between two jumps, the process $\Pi$ follows the deterministic curves of $x$, and at jump
times it switches from one curve to another. Panel (b) gives the value function $V(\cdot)$ and the
stopping ($\Gamma$) and continuation ($C$) regions. Similarly, Panels (c) and (d) correspond to another
problem where $\lambda_0 = 5$, $\lambda_1 = 10$, $\lambda = 3$, $c = 1$.

Panels (c) and (d), on the other hand, correspond to another sample problem where $\lambda_0 = 5$,
$\lambda_1 = 10$, $\lambda = 3$, $c = 1$. In this case, we have $\lambda_1 > \lambda + \lambda_0$. Therefore, the paths $t \mapsto x(t,\pi)$
converge asymptotically to the point $(0, 0.4, 0.6)$ as indicated in (5.1). Moreover, since $\lambda_1 > \lambda_0$,
the $(n+1)$st component of $\Pi$ increases at jump times, so $\Pi$ moves closer to the point $(0,0,1)$. In this case we
use steps (1'), (2'), (3') and (4') of the algorithm presented at the end of Section 4.3 to
approximate the value function. In Panel (d), we verify our claims in Section 3 again. That is, the
value function is a concave function and the stopping region $\Gamma$ is a convex region around the point
$(0,0,1)$.

5.2 Hyperexponential distribution

Let us here reconsider the formulation of the classical Poisson disorder problem with exponential
prior distribution, and let us assume that the rate of this exponential distribution is not known
precisely. Rather, there are $n$ possible rates $\mu_1, \mu_2, \dots, \mu_n$ with prior likelihoods $(\pi_1, \dots, \pi_n, \pi_\Delta)$, and
the aim is to detect the change time $\Theta$ by minimizing $R_\tau(\pi)$ in (2.7).

This problem can be modeled as a special case of the phase-type Poisson disorder problem if we
take the column vector $r$ in (2.2) in the form $r = [\mu_1, \mu_2, \dots, \mu_n]'$ with $\mu_i > 0$ for $i = 1, \dots, n$. Moreover,
we let the matrix $R$ in (2.2) be $R = -r' \cdot I$, where $I$ is the $n \times n$ identity matrix; that is, $R$ is diagonal with entries $-\mu_1, \dots, -\mu_n$. In this case, if the
process $M$ starts from a transient state, it is absorbed into the state $\Delta$ at the first transition time,
and conditioned on the initial state $i$ the hitting time has exponential distribution with parameter
$\mu_i$.

In this case, by direct computation it can be shown that the deterministic paths $x(\cdot,\cdot)$ have the
form

$$ x_i(t,\pi) = \frac{\pi_i e^{-\mu_i t}}{\left(\sum_{k=1}^n \pi_k e^{-\mu_k t}\right) + e^{-(\lambda_1-\lambda_0)t}\left(\pi_\Delta + \sum_{k=1}^n \pi_k \int_0^t e^{(\lambda_1-\lambda_0)u} f_k(u)\,du\right)}, \tag{5.2} $$

for $1 \le i \le n$, and

$$ x_\Delta(t,\pi) = \frac{e^{-(\lambda_1-\lambda_0)t}\left(\pi_\Delta + \sum_{k=1}^n \pi_k \int_0^t e^{(\lambda_1-\lambda_0)u} f_k(u)\,du\right)}{\left(\sum_{k=1}^n \pi_k e^{-\mu_k t}\right) + e^{-(\lambda_1-\lambda_0)t}\left(\pi_\Delta + \sum_{k=1}^n \pi_k \int_0^t e^{(\lambda_1-\lambda_0)u} f_k(u)\,du\right)}, $$

where $f_k(u) = \mu_k e^{-\mu_k u}$, for $1 \le k \le n$.

Without loss of generality let us assume that $\mu_1 > \mu_2 > \dots > \mu_n$. Then, on $\{\pi \in D : \pi_n \ne 0\}$,
the path $x_i(t,\pi)$ goes to $0$ as $t \to \infty$ for $i = 1, \dots, n-1$. If $\mu_n - \lambda_1 + \lambda_0 \ge 0$, then $x_\Delta(t,\pi)$ converges
to 1 asymptotically; otherwise we have

$$ \lim_{t \to \infty} x_n(t,\pi) = 1 - \frac{\mu_n}{\lambda_1 - \lambda_0}, \quad \text{and} \quad \lim_{t \to \infty} x_\Delta(t,\pi) = \frac{\mu_n}{\lambda_1 - \lambda_0}. \tag{5.3} $$

On the other hand, on the region $\{\pi \in D : \pi_n = 0, \pi_{n-1} \ne 0\}$, the above statements hold by
replacing $n$ and $\mu_n$ with $n - 1$ and $\mu_{n-1}$ respectively, and so on. If a non-absorbing state has the
initial likelihood $\pi_i = 0$, then $\Pi^{(i)}_t = 0$ for all $t \ge 0$ by (5.2) and (2.37). Indeed, since the disorder
occurs at the first transition time, this state can be eliminated from the problem.
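The limits in (5.3) are easy to reproduce numerically from (5.2). The sketch below evaluates the path in the hyperexponential case, using the elementary identity $e^{-at}\int_0^t e^{au}\mu_k e^{-\mu_k u}\,du = \mu_k (e^{-\mu_k t} - e^{-at})/(a - \mu_k)$ with $a = \lambda_1 - \lambda_0$, which is valid when $a \ne \mu_k$. The function name `hyperexp_path`, the tuple layout of `pi` and the starting point used in the checks are assumptions made for the illustration.

```python
import math

def hyperexp_path(t, pi, mu, lam0, lam1):
    """Deterministic path x(t, pi) of (5.2) for a hyperexponential prior.

    pi = (pi_1, ..., pi_n, pi_Delta), mu = (mu_1, ..., mu_n).
    Assumes lam1 - lam0 differs from every mu_k (the integral is evaluated in closed form).
    """
    a = lam1 - lam0
    n = len(mu)
    # unnormalized weight of "no change yet, rate mu_k in force"
    alive = [pi[k] * math.exp(-mu[k] * t) for k in range(n)]
    # unnormalized weight of "change has already happened"
    post = pi[n] * math.exp(-a * t) + sum(
        pi[k] * mu[k] * (math.exp(-mu[k] * t) - math.exp(-a * t)) / (a - mu[k])
        for k in range(n)
    )
    total = sum(alive) + post
    return [v / total for v in alive] + [post / total]
```

With $\mu = (3, 2)$, $\lambda_0 = 2$, $\lambda_1 = 6$ the path settles near $(0, 0.5, 0.5)$, whereas with $\lambda_1 = 1$ it is driven to $(0, 0, 1)$, in line with the two cases distinguished above.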

Figure 2 presents two numerical examples with two transient states. In Panels (a) and (b), we
see the value function and the paths $x(\cdot,\cdot)$ of a problem where the parameters are $\mu_1 = 3$, $\mu_2 = 2$,
$\lambda_0 = 2$, $\lambda_1 = 1$, $c = 1.5$. Between two jumps, the process $\Pi$ follows the paths $t \mapsto x(t,\pi)$, which
converge to the point $(0,0,1)$ asymptotically. Moreover, since $\lambda_1 < \lambda_0$, the process $\Pi$ jumps
away from this point, and we decrease the conditional likelihood of the disorder event at arrival
times of $X$. In this case, we use steps (1), (2), (3) and (4) of the algorithm presented at the end
of Section 4.3 to approximate the value function. In Panel (b), we observe that the value function
is concave, and the stopping region is a convex region with non-empty interior around the point
$(0,0,1)$ as indicated in Section 3.





Figure 2: Examples with hyperexponential prior distributions. In Panels (a) and (b) we see the
sample path properties and the value function of a problem where $\mu_1 = 3$, $\mu_2 = 2$, $\lambda_0 = 2$, $\lambda_1 = 1$,
$c = 1.5$. Continuous paths in Panel (a) are the paths of $t \mapsto x(t,\pi)$ for different starting points.
The discontinuous path with arrows is a sample path of $t \mapsto \Pi_t$. Panel (b) illustrates the value
function $V(\cdot)$ and the stopping ($\Gamma$) and continuation ($C$) regions. Similarly, in Panels (c) and (d)
we see another problem with $\mu_1 = 3$, $\mu_2 = 2$, $\lambda_0 = 2$, $\lambda_1 = 6$, $c = 1.5$.

In Panels (c) and (d) we have another problem whose parameters are $\mu_1 = 3$, $\mu_2 = 2$, $\lambda_0 = 2$,
$\lambda_1 = 6$, $c = 1.5$. In this case we have $\mu_n - \lambda_1 + \lambda_0 < 0$. Hence, in accordance with (5.3), we see
that the paths $t \mapsto x(t,\pi)$ converge to the point $(0, 0.5, 0.5)$. Also, since $\lambda_1 > \lambda_0$, at the jump
times of $X$, the process $\Pi$ jumps towards the point $(0,0,1)$ and the conditional probability of the
disorder event is increased. In this case we use steps (1'), (2'), (3') and (4') of the algorithm
presented at the end of Section 4.3 to approximate the value function. In Panel (d), we verify
once again the concavity of the value function and convexity of the stopping region around the
point $(0,0,1)$.

Non-smooth behavior of the value function on the region where $\pi_1 = 0$ is in accordance with
Lemma 7.1 of Dayanik and Sezer (2005). On this region, the problem is essentially one with a single
non-absorbing state. The point $\pi = (0, 1 - \mu_2/(\lambda_1 - \lambda_0), \mu_2/(\lambda_1 - \lambda_0)) = (0, 0.5, 0.5)$ falls into the
continuation region, and the function is not differentiable at the boundary point of this line segment.



6 Appendix 



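Proof of Lemma 3.1. Using (2.34) and (3.10) we write

$$ \begin{aligned} \int_0^t P^\pi\{s < \sigma_1\}\, k(x(s,\pi))\,ds &= \int_0^t e^{-\lambda_0 s}\, E^\pi\big\{e^{-(\lambda_1-\lambda_0)(s-\Theta)^+}\big\}\, k(x(s,\pi))\,ds \\ &= \int_0^t ds\, e^{-\lambda_0 s}\, E^\pi\big\{e^{-(\lambda_1-\lambda_0)(s-\Theta)^+}\big\} \cdot \frac{c\, E^\pi\big\{1_{\{s \ge \Theta\}} e^{-(\lambda_1-\lambda_0)(s-\Theta)}\big\}}{E^\pi\big\{e^{-(\lambda_1-\lambda_0)(s-\Theta)^+}\big\}} \\ &= \frac{c\,\pi_\Delta}{\lambda_1}\left(1 - e^{-\lambda_1 t}\right) + c \sum_{i=1}^n \pi_i \int_0^t e^{-\lambda_0 s} \left(\int_0^s e^{-(\lambda_1-\lambda_0)(s-u)} f_i(u)\,du\right) ds, \end{aligned} \tag{6.1} $$

where $f_i(\cdot)$ is the probability density function of $\Theta$, given that $M_0 = i$. Therefore, $\pi \mapsto \int_0^t P^\pi\{s < \sigma_1\}\, k(x(s,\pi))\,ds$ is linear. Next, we observe that

$$ \begin{aligned} h(x(t,\pi))\, P^\pi\{t < \sigma_1\} &= (1 - x_\Delta(t,\pi)) \cdot e^{-\lambda_0 t} \cdot E^\pi\big\{e^{-(\lambda_1-\lambda_0)(t-\Theta)^+}\big\} \\ &= e^{-\lambda_0 t} \cdot E^\pi\big\{1_{\{t < \Theta\}}\big\} = e^{-\lambda_0 t} \sum_{i=1}^n \pi_i \int_t^\infty f_i(u)\,du; \end{aligned} \tag{6.2} $$

hence the mapping

$$ \pi \mapsto h(x(t,\pi))\, P^\pi(t < \sigma_1) \text{ is linear.} \tag{6.3} $$

Finally, let $w(\cdot)$ be a positive and concave function. Then it can be written as

$$ w(\pi) = \inf_{k \in K} \left(\beta_0^{(k)} + \beta_1^{(k)} \pi_1 + \dots + \beta_n^{(k)} \pi_n + \beta_\Delta^{(k)} \pi_\Delta\right), \tag{6.4} $$

for some index set $K$ and constants $\beta_j^{(k)}$. Using this representation of $w(\cdot)$, (2.34), and (3.11) we
obtain

$$ \begin{aligned} &\int_0^t P^\pi(\sigma_1 \in ds)\, Sw(x(s,\pi)) \\ &= \int_0^t P^\pi\{\sigma_1 \in ds\}\, w\left(\frac{\lambda_0 x_1(s,\pi)}{\lambda_0(1 - x_\Delta(s,\pi)) + \lambda_1 x_\Delta(s,\pi)}, \dots, \frac{\lambda_1 x_\Delta(s,\pi)}{\lambda_0(1 - x_\Delta(s,\pi)) + \lambda_1 x_\Delta(s,\pi)}\right) \\ &= \int_0^t ds\, e^{-\lambda_0 s} \Big(\lambda_0 E^\pi\big\{1_{\{s < \Theta\}}\big\} + \lambda_1 E^\pi\big\{1_{\{s \ge \Theta\}} e^{-(\lambda_1-\lambda_0)(s-\Theta)}\big\}\Big) \\ &\qquad \times \inf_{k \in K} \left[\beta_0^{(k)} + \frac{\beta_1^{(k)} \lambda_0 E^\pi\big\{1_{\{M_s = 1\}}\big\} + \dots + \beta_\Delta^{(k)} \lambda_1 E^\pi\big\{1_{\{s \ge \Theta\}} e^{-(\lambda_1-\lambda_0)(s-\Theta)}\big\}}{\lambda_0 E^\pi\big\{1_{\{s < \Theta\}}\big\} + \lambda_1 E^\pi\big\{1_{\{s \ge \Theta\}} e^{-(\lambda_1-\lambda_0)(s-\Theta)}\big\}}\right] \\ &= \int_0^t ds\, e^{-\lambda_0 s} \inf_{k \in K} \Big(\beta_0^{(k)} \big(\lambda_0 E^\pi\big\{1_{\{s < \Theta\}}\big\} + \lambda_1 E^\pi\big\{1_{\{s \ge \Theta\}} e^{-(\lambda_1-\lambda_0)(s-\Theta)}\big\}\big) \\ &\qquad\qquad + \beta_1^{(k)} \lambda_0 E^\pi\big\{1_{\{M_s = 1\}}\big\} + \dots + \beta_\Delta^{(k)} \lambda_1 E^\pi\big\{1_{\{s \ge \Theta\}} e^{-(\lambda_1-\lambda_0)(s-\Theta)}\big\}\Big). \end{aligned} \tag{6.5} $$

Note that the term inside the parentheses is linear in $\pi$ due to (2.4). Hence it follows that

$$ \pi \mapsto \int_0^t P^\pi\{\sigma_1 \in ds\}\, Sw(x(s,\pi)) \text{ is concave,} \tag{6.6} $$

since the lower envelope of linear functions is concave.

As a sum of three concave mappings, $\pi \mapsto Jw(t,\pi)$ is concave for all $t \ge 0$. Also, as the lower
envelope of concave functions, the mapping $\pi \mapsto J_0 w(\pi) = \inf_{t \ge 0} Jw(t,\pi)$ is again concave. □

Proof of Lemma 3.2. Let $w(\cdot)$ be a bounded continuous function. Then as in (6.5), we have

$$ \begin{aligned} \int_0^t P^\pi(\sigma_1 \in ds)\, Sw(x(s,\pi)) &= \int_0^t \lambda_0 e^{-\lambda_0 s} \left(\sum_{i=1}^n \pi_i \int_s^\infty f_i(u)\,du\right) Sw(x(s,\pi))\,ds \\ &\quad + \int_0^t \lambda_1 \left(\pi_\Delta e^{-\lambda_1 s} + e^{-\lambda_0 s} \sum_{i=1}^n \pi_i \int_0^s e^{-(\lambda_1-\lambda_0)(s-u)} f_i(u)\,du\right) Sw(x(s,\pi))\,ds, \end{aligned} \tag{6.7} $$

where $f_i(\cdot)$ is the probability density function of $\Theta$, given that $M_0 = i$. Then using (6.1), (6.2) and
(6.7), it can easily be verified that the mapping $(t,\pi) \mapsto Jw(t,\pi)$ is jointly continuous on $\mathbb{R}_+ \times D$.
The mapping $(t,\pi) \mapsto Jw(t,\pi)$ is then uniformly continuous on $[0,k] \times D$ for all $k \in \mathbb{N}$. Therefore,
the mapping

$$ \pi \mapsto J_{0,k} w(\pi) = \inf_{t \in [0,k]} Jw(t,\pi) \text{ is continuous on } D. \tag{6.8} $$

On the other hand, using (3.8) and (6.2) we can write

$$ \begin{aligned} Jw(t,\pi) &= Jw(t \wedge k,\pi) + \int_{t \wedge k}^t P^\pi\{s < \sigma_1\}\, k(x(s,\pi))\,ds \\ &\quad + \sum_{i=1}^n \pi_i \left(e^{-\lambda_0 t} \int_t^\infty f_i(s)\,ds - e^{-\lambda_0 (t \wedge k)} \int_{t \wedge k}^\infty f_i(s)\,ds\right) + \int_{t \wedge k}^t P^\pi(\sigma_1 \in ds)\, Sw(x(s,\pi)) \\ &\ge Jw(t \wedge k,\pi) - e^{-\lambda_0 k} \sum_{i=1}^n \pi_i \int_k^\infty f_i(s)\,ds - (\lambda_0 \vee \lambda_1) \int_k^\infty e^{-(\lambda_0 \wedge \lambda_1)s} \cdot \|w\|\,ds \\ &\ge Jw(t \wedge k,\pi) - e^{-(\lambda_0 \wedge \lambda_1)k} \big(1 + (\lambda_0 \vee \lambda_1) \cdot \|w\|\big), \end{aligned} \tag{6.9} $$

where $\|w\| = \sup_{\pi \in D} |w(\pi)|$. By taking the infimum on both sides of (6.9) we get

$$ J_{0,k} w(\pi) \ge J_0 w(\pi) \ge J_{0,k} w(\pi) - e^{-(\lambda_0 \wedge \lambda_1)k} \big(1 + (\lambda_0 \vee \lambda_1) \cdot \|w\|\big), \tag{6.10} $$

which implies that $J_{0,k} w(\cdot) \to J_0 w(\cdot)$ uniformly on $D$. This fact together with (6.8) implies that
$\pi \mapsto J_0 w(\pi)$ is continuous on $D$. □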

Proof of Proposition 3.2. First, we will prove (3.16) by an induction on $m \in \mathbb{N}$. For $m = 1$ the
left-hand side of (3.16) becomes



= J^ (rg(7f),7f) < J vo(n) + e = v 1 (7r) +e, 



(6.11) 
where we used (3.7), (3.13) and (3.14). Also note that we used Remark 3.1 for the inequality
above. This inequality in (6.11) proves that (3.16) holds for $m = 1$. Now, suppose (3.16) holds for
$\varepsilon > 0$, and for some $m \ge 1$. We will prove that it also holds when $m$ is replaced by $m + 1$. Since
$S^{\varepsilon}_{m+1} \wedge \sigma_1 = r^{\varepsilon/2}_m(\pi) \wedge \sigma_1$, we have



E 77 



E 71 



E 71 



(vr)Ao-i 



E 71 



in which 



+ /l( % 2 WAa 1 ) 1 {^ 2 W<CTl} 
»-m 2 (7r)Ao-i 



fc(ri t )^ + Mris* 



+ /J ( li 5^ +1 )l{^ +1 < ( , 1 } 



Ku t )dt+h(u ai+sfi(2oeai : 



+ E 7r <M 



{^m 2 (7?)>0-l} 



/m(n, 



/m(7?) = E* 



W2 



fc(n t )rft + /i(n 5£/2 ) ^ < 1^(7?) + e/2, 



(6.12) 



(6.13) 



where the inequality follows from the induction hypothesis, and the last line of (6.12) follows from
the strong Markov property of the process $\Pi$. Then we obtain



EU / k(U t )dt + h(U S e ) <E- 



rm 2 (7f)A(T 1 



fe(n,)di + Mne/ 2WACTi )i K/2(fio)<(7i} 



+ E 71 M 



{ 1 {r^ 2 (n )>(7 1 } ?;OT ^ <T1 )} + I = J ^( r m 2 (7?)^) + | < MlW+e. 



where the first equality follows from the definition of the operator $J$ in (3.7) and the second equality
follows from (3.14). This concludes the proof of (3.15).

The inequality $V_n \le v_n$ follows immediately from (3.16) since $S^{\varepsilon}_n \le \sigma_n$ by construction. Let us
prove the opposite inequality $V_n \ge v_n$. First, we will establish

rrl\a m \ 

W K I k(U t )dt + h(U TAtTm ) } > v m {Tf), (6.14) 



o 



for every $m \in \mathbb{N}$, by showing that

fTA(T m 



> E 71 " 



o 



k(U t )dt + h(U TAan 

rAcr m _fe + i 



k(U t )dt + l {r > CTm _ fe+l} « fc _i(n (Tm _ fc+1 ) + l{ T<CTm _ fe+1 }/i(n T ) \ =: RHSk-i, 

(6.15) 
for $k = 1, \dots, m + 1$. Note that (6.14) follows from (6.15) when we take $k = m + 1$. For $k = 1$,
(6.14) is satisfied as an equality since $v_0(\cdot) = h(\cdot)$. Now, let us suppose that (6.14) holds for some
$1 \le k < m + 1$, and let us prove that it also holds for $k + 1$.
Note that $RHS_{k-1}$ can be decomposed as



RH Sfc^i — RHSu}_i + RHSl\, 



in which 



RHSj}^ = E* 



rA<x m _ fe 



fc(n t )di + i {T<<Tm _ fc} /i(n r ) 



+ ^T^-fc+i^fc-l^m-fc+l) + 1 {r< ( r m _ fc+1 }/i(n T ) 

By Lemma 3.3, there exists an $\mathcal{F}_{\sigma_{m-k}}$-measurable random variable $R_{m-k}$ such that

$\tau \wedge \sigma_{m-k+1} = (\sigma_{m-k} + R_{m-k}) \wedge \sigma_{m-k+1}$ on $\{\tau \ge \sigma_{m-k}\}$.
Then $RHS^{(2)}_{k-1}$ can be written as $RHS^{(2)}_{k-1} =$



EN 1 



{T>°"m-fc} 



m— fc+1 



°~m — fc 



^ m — fc + 1 k+Rn 

in which 



| = k w |i {r > . m _ fe} 5 m _ fe (i? m _ fc ,n (7m _ )t )| , 



rAcri 



g m -k(r, vf) = e 7 " |y k(u t )dt + i{ r > ffl }?)n(n ffl ) + i {r<CT1 }/i(n T ; 

= JUjfc_l(r,7f) > J Ufc_l(7f) = Vfc(7f). 



(6.16) 



(6.17) 



(6.18) 



(6.19) 



The second equality in (6.18) follows from the strong Markov property of the process $\Pi$ and the fact
that the jump times of the observation process $X$ and $\Pi$ are the same. Therefore, the expression
for $RHS^{(2)}_{k-1}$ is bounded below as

$$ RHS^{(2)}_{k-1} \ge E^\pi\big\{1_{\{\tau \ge \sigma_{m-k}\}}\, v_k(\Pi_{\sigma_{m-k}})\big\}. \tag{6.20} $$

Therefore,



UTAcr m 
k(U t )dt + h(U TAam )j 



{ I k(U t )dt + l {T<am ^ } h(U T ) + l {T > am _ k} v k (U m _ k )\ (6.21) 
This completes the proof of (6.15) by induction. Equation (6.14) follows when we set $k = n + 1$.
Finally, taking the infimum of both sides in (6.14), we arrive at the desired inequality $V_n \ge v_n$. □

Proof of Proposition 3.3. Using Proposition 3.1, Corollary 3.1 and (3.2) we obtain

$$ v(\pi) = \lim_{m \to \infty} v_m(\pi) = \lim_{m \to \infty} V_m(\pi) = V(\pi), \quad \pi \in D, \tag{6.22} $$

which proves the first statement of the proposition. To prove the second statement, we note that
the sequence $\{v_n\}_{n \ge 1}$ is decreasing and

$$ \begin{aligned} V(\pi) = v(\pi) &= \inf_{m \ge 1} v_m(\pi) = \inf_{m \ge 1} \inf_{t \ge 0} Jv_{m-1}(t,\pi) = \inf_{t \ge 0} \inf_{m \ge 1} Jv_{m-1}(t,\pi) \\ &= \inf_{t \ge 0} \inf_{m \ge 1} \left\{ \int_0^t P^\pi\{s < \sigma_1\}\, k(x(s,\pi))\,ds + \int_0^t P^\pi\{\sigma_1 \in ds\}\, Sv_{m-1}(x(s,\pi)) + P^\pi\{t < \sigma_1\}\, h(x(t,\pi)) \right\} \\ &= \inf_{t \ge 0} \left\{ \int_0^t P^\pi\{s < \sigma_1\}\, k(x(s,\pi))\,ds + \int_0^t P^\pi\{\sigma_1 \in ds\}\, Sv(x(s,\pi)) + P^\pi\{t < \sigma_1\}\, h(x(t,\pi)) \right\} \\ &= \inf_{t \ge 0} Jv(t,\pi) = J_0 v(\pi). \end{aligned} \tag{6.23} $$

This proves that $V$ is a solution of $U = J_0 U$. The third line of (6.23) follows from the bounded
convergence theorem. Next, let $U$ be a solution of $U = J_0 U$ such that $U \le h$. Then by Remark 3.2 we
have $U = J_0 U \le J_0 h = v_1$. Now, suppose $U \le v_m$ for some $m \ge 0$; then $U = J_0 U \le J_0 v_m = v_{m+1}$.
By induction, we conclude that $U \le v_m$ for all $m \ge 1$, and therefore $U \le \lim_{m \to \infty} v_m = v = V$. □

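Proof of Lemma 3.4. Let us fix a constant $u \ge t$ and $\pi \in D$. Then

$$ \begin{aligned} Jw(u,\pi) &= E^\pi\left\{ \int_0^{u \wedge \sigma_1} k(\Pi_s)\,ds + 1_{\{u < \sigma_1\}} h(\Pi_u) + 1_{\{u \ge \sigma_1\}} w(\Pi_{\sigma_1}) \right\} \\ &= E^\pi\left\{ \int_0^{t \wedge \sigma_1} k(\Pi_s)\,ds + 1_{\{u < \sigma_1\}} h(\Pi_u) + 1_{\{u \ge \sigma_1\}} w(\Pi_{\sigma_1}) \right\} + E^\pi\left\{ 1_{\{\sigma_1 > t\}} \int_t^{u \wedge \sigma_1} k(\Pi_s)\,ds \right\}. \end{aligned} \tag{6.24} $$

On the event $\{\sigma_1 > t\}$, we have $u \wedge \sigma_1 = t + [(u - t) \wedge (\sigma_1 \circ \theta_t)]$, where $\theta_t$ denotes the time-shift operator. Therefore, the strong Markov
property implies

$$ \begin{aligned} E^\pi\left\{ 1_{\{\sigma_1 > t\}} \int_t^{u \wedge \sigma_1} k(\Pi_s)\,ds \right\} &= E^\pi\left\{ 1_{\{\sigma_1 > t\}}\, E^{\Pi_t}\left\{ \int_0^{(u-t) \wedge \sigma_1} k(\Pi_s)\,ds \right\} \right\} \\ &= E^\pi\left\{ 1_{\{\sigma_1 > t\}} \left[ Jw(u - t, \Pi_t) - E^{\Pi_t}\left\{ 1_{\{u - t < \sigma_1\}} h(\Pi_{u-t}) + 1_{\{u - t \ge \sigma_1\}} w(\Pi_{\sigma_1}) \right\} \right] \right\} \\ &= P^\pi\{\sigma_1 > t\}\, Jw(u - t, x(t,\pi)) - E^\pi\left\{ 1_{\{\sigma_1 > u\}} h(\Pi_u) \right\} - E^\pi\left\{ 1_{\{\sigma_1 > t\}} 1_{\{u \ge \sigma_1\}} w(\Pi_{\sigma_1}) \right\}, \end{aligned} \tag{6.25} $$

where the second equality follows from the definition of the operator $J$, and the third from (2.37)
and the strong Markov property. Substituting (6.25) into (6.24), after some simplification, yields

$$ Jw(u,\pi) = Jw(t,\pi) + P^\pi\{\sigma_1 > t\} \left[ Jw(u - t, x(t,\pi)) - h(x(t,\pi)) \right]. \tag{6.26} $$

Now, taking the infimum of both sides over $u \in [t,\infty]$ concludes the proof. □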

Proof of Corollary 3.2. Note that by Remark 3.1, we have

$$ Jv_m(r_m(\pi),\pi) = J_0 v_m(\pi) = J_{r_m(\pi)} v_m(\pi). \tag{6.27} $$

Let us first assume that $r_m(\pi) < \infty$. Taking $t = r_m(\pi)$ and $w = v_m$ in (3.17) gives

$$ \begin{aligned} Jv_m(r_m(\pi),\pi) &= J_{r_m(\pi)} v_m(\pi) \\ &= Jv_m(r_m(\pi),\pi) + P^\pi\{\sigma_1 > r_m(\pi)\} \left[ v_{m+1}(x(r_m(\pi),\pi)) - h(x(r_m(\pi),\pi)) \right]. \end{aligned} $$

Hence, we have $v_{m+1}(x(r_m(\pi),\pi)) = h(x(r_m(\pi),\pi))$.
If $0 < t < r_m(\pi)$, then

$$ Jv_m(t,\pi) > J_0 v_m(\pi) = J_{r_m(\pi)} v_m(\pi) = J_t v_m(\pi). \tag{6.28} $$

Using (3.17) one more time, we get

$$ J_0 v_m(\pi) = J_t v_m(\pi) = Jv_m(t,\pi) + P^\pi\{\sigma_1 > t\} \left[ v_{m+1}(x(t,\pi)) - h(x(t,\pi)) \right]. \tag{6.29} $$

This equation together with (6.28) implies that $v_{m+1}(x(t,\pi)) < h(x(t,\pi))$ for $t \in (0, r_m(\pi))$.

If $r_m(\pi) = \infty$, then $v_{m+1}(x(t,\pi)) < h(x(t,\pi))$ for every $t \in (0,\infty)$ by the same argument as
in the last paragraph. The statement of the corollary still holds in this case, since by convention
$\inf \emptyset = \infty$. □



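Proof of Proposition 3.4. The proof will be based on an induction. For $m = 1$, by Lemma 3.3
there exists a constant $u \in [0,\infty]$ such that $U_\varepsilon \wedge \sigma_1 = u \wedge \sigma_1$. Then

$$ \begin{aligned} E^\pi\{L_{U_\varepsilon \wedge \sigma_1}\} &= E^\pi\left\{ \int_0^{u \wedge \sigma_1} k(\Pi_s)\,ds + V(\Pi_{u \wedge \sigma_1}) \right\} \\ &= E^\pi\left\{ \int_0^{u \wedge \sigma_1} k(\Pi_s)\,ds + 1_{\{u \ge \sigma_1\}} V(\Pi_{\sigma_1}) + 1_{\{u < \sigma_1\}} h(\Pi_u) \right\} + E^\pi\left\{ 1_{\{u < \sigma_1\}} \left[ V(\Pi_u) - h(\Pi_u) \right] \right\} \\ &= JV(u,\pi) + P^\pi\{u < \sigma_1\} \left[ V(x(u,\pi)) - h(x(u,\pi)) \right] = J_u V(\pi), \end{aligned} \tag{6.30} $$

where the third equality follows from (3.7) and (2.37). The last equality follows from (3.21).

Fix any $t \in [0,u)$. By (3.21) again

$$ \begin{aligned} JV(t,\pi) &= J_t V(\pi) - P^\pi\{\sigma_1 > t\} \left[ V(x(t,\pi)) - h(x(t,\pi)) \right] \\ &\ge J_0 V(\pi) - P^\pi\{\sigma_1 > t\} \left[ V(x(t,\pi)) - h(x(t,\pi)) \right] = J_0 V(\pi) - E^\pi\left\{ 1_{\{\sigma_1 > t\}} \left[ V(\Pi_t) - h(\Pi_t) \right] \right\}. \end{aligned} \tag{6.31} $$

On the event $\{\sigma_1 > t\}$ we have $U_\varepsilon > t$ (otherwise, $U_\varepsilon \le t < \sigma_1$ would imply $U_\varepsilon = u \le t$, and this
would contradict our initial choice of $t < u$). Thus, $V(\Pi_t) - h(\Pi_t) < -\varepsilon$ on $\{\sigma_1 > t\}$. Hence,

$$ JV(t,\pi) > J_0 V(\pi) + \varepsilon P^\pi\{\sigma_1 > t\} > J_0 V(\pi), \quad t \in [0,u). \tag{6.32} $$

Therefore, $J_0 V(\pi) = J_u V(\pi)$ and (6.30) implies that

$$ E^\pi\{L_{U_\varepsilon \wedge \sigma_1}\} = J_u V(\pi) = J_0 V(\pi) = V(\pi) = L_0, \tag{6.33} $$

which completes the proof for $m = 1$.

Assume (3.28) holds for $m \ge 1$. Note that

$$ \begin{aligned} E^\pi\{L_{U_\varepsilon \wedge \sigma_{m+1}}\} &= E^\pi\left\{ 1_{\{U_\varepsilon < \sigma_1\}} L_{U_\varepsilon} + 1_{\{U_\varepsilon \ge \sigma_1\}} L_{U_\varepsilon \wedge \sigma_{m+1}} \right\} = E^\pi\left\{ 1_{\{U_\varepsilon < \sigma_1\}} L_{U_\varepsilon} \right\} \\ &\quad + E^\pi\left\{ 1_{\{U_\varepsilon \ge \sigma_1\}} \left[ \int_{\sigma_1}^{U_\varepsilon \wedge \sigma_{m+1}} k(\Pi_s)\,ds + V(\Pi_{U_\varepsilon \wedge \sigma_{m+1}}) \right] \right\} + E^\pi\left\{ 1_{\{U_\varepsilon \ge \sigma_1\}} \int_0^{\sigma_1} k(\Pi_s)\,ds \right\}. \end{aligned} \tag{6.34} $$

Since $U_\varepsilon \wedge \sigma_{m+1} = \sigma_1 + [(U_\varepsilon \wedge \sigma_m) \circ \theta_{\sigma_1}]$ on the event $\{U_\varepsilon \ge \sigma_1\}$, where $\theta$ denotes the time-shift operator, the strong Markov property of $\Pi$
implies that

$$ \begin{aligned} E^\pi\{L_{U_\varepsilon \wedge \sigma_{m+1}}\} &= E^\pi\left\{ 1_{\{U_\varepsilon < \sigma_1\}} L_{U_\varepsilon} \right\} + E^\pi\left\{ 1_{\{U_\varepsilon \ge \sigma_1\}}\, E^{\Pi_{\sigma_1}}\left\{ \int_0^{U_\varepsilon \wedge \sigma_m} k(\Pi_s)\,ds + V(\Pi_{U_\varepsilon \wedge \sigma_m}) \right\} \right\} \\ &\quad + E^\pi\left\{ 1_{\{U_\varepsilon \ge \sigma_1\}} \int_0^{\sigma_1} k(\Pi_s)\,ds \right\}. \end{aligned} \tag{6.35} $$

By the induction hypothesis we can replace the inner expectation with $V(\Pi_{\sigma_1})$ and obtain

$$ \begin{aligned} E^\pi\{L_{U_\varepsilon \wedge \sigma_{m+1}}\} &= E^\pi\left\{ 1_{\{U_\varepsilon < \sigma_1\}} L_{U_\varepsilon} + 1_{\{U_\varepsilon \ge \sigma_1\}} \left[ \int_0^{\sigma_1} k(\Pi_s)\,ds + V(\Pi_{\sigma_1}) \right] \right\} \\ &= E^\pi\left\{ 1_{\{U_\varepsilon < \sigma_1\}} L_{U_\varepsilon} + 1_{\{U_\varepsilon \ge \sigma_1\}} L_{\sigma_1} \right\} = E^\pi\{L_{U_\varepsilon \wedge \sigma_1}\} = L_0, \end{aligned} \tag{6.36} $$

where the last equality follows from the above proof for $m = 1$. This completes the proof of the
statement. □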